Wednesday, August 15, 2012

Types

Fundamentally, Triceps is a language, even though it is piggybacking on the other languages. And like in pretty much any programming language, pretty much anything in it has a type. Only the tip of that type system is exposed in the Perl API, as the RowType and TableType. But the C++ API has the whole depth. The types go all the way down to the simple types of the fields.

The classes for types are generally defined in the subdirectory type/. The class Type, defiined in type/Type.h is the common base class.

First, every kind of type has its entry in the enum TypeId:

        TT_VOID, // no value
        TT_UINT8, // unsigned 8-bit integer (byte)
        TT_INT32, // 32-bit integer
        TT_INT64,
        TT_FLOAT64, // 64-bit floating-point, what C calls "double"
        TT_STRING, // a string: a special kind of byte array
        TT_ROW, // a row of a table
        TT_RH, // row handle: item through which all indexes in the table own a row
        TT_TABLE, // data store of rows (AKA "window")
        TT_INDEX, // a table contains one or more indexes for its rows
        TT_AGGREGATOR, // user piece of code that does aggregation on the indexes
        TT_ROWSET, // an ordered set of rows

TT_ROWSET is something added in version 1.1.0, it will be described later. TT_VOID is pretty much a placeholder, in case if a void type would be needed later. The TypeId gets hardcoded in the constructor of every Type sub-class. It can be gotten back with the method

TypeId getTypeId() const;

Another method finds out if the type is the simple type of a field:

bool isSimple() const;

It would be true for the types of ids TT_VOID, TT_UINT8,  TT_INT32,  TT_INT64, TT_FLOAT64, TT_STRING.


Generally, you can check the TypeId and then cast the Type pointer to its subclass. All the simple types have the common base class SimpleType, which will be described in a moment.


There is also a static Type method that finds a simple type object by name (like "int32", "string" etc.):


static Onceref<const SimpleType> findSimpleType(const char *name);


Basically, there is not a whole lot of point in having lots of copies of the simple type objects (though if you want, you can). So there is one common copy of each simple type that can be found by name. If the type is known when you compile your C++ program, you can even avoid the look-up and refer to these objects directly.

    static Autoref<const SimpleType> r_void;
    static Autoref<const SimpleType> r_uint8;
    static Autoref<const SimpleType> r_int32;
    static Autoref<const SimpleType> r_int64;
    static Autoref<const SimpleType> r_float64;
    static Autoref<const SimpleType> r_string;

The type construction may cause errors. It is usually done either by a single constructor with all the needed arguments, or a simple constructor, then additional methods to add the information in bits in pieces, then an initialization method. In both cases there is a problem of how to report the errors. They're not easy to return from a constructor and a pain to check in the bit-by-bit construction.

Instead the error information gets internally collected in an Errors object, and can be read after the construction and/or initialization is completed:

virtual Erref getErrors() const;

A type with errors may not be used for anything other than reading the errors.

The rest of the common virtual methods has to do with the type comparison and print-outs. The comparison methods essentially check if two type objects are aliases for each other:

virtual bool equals(const Type *t) const;
virtual bool match(const Type *t) const;

The concept has been previously described with the Perl API. The equal types are exactly the same. The matching types are the same except for the names of their elements, so it's generally safe to pass the values between these types.

equals() is also available as operator==.

The print methods create a string representation of a type, used mostly for the error messages. There is no method to parse this string representation back, at least yet.

virtual void printTo(string &res, const string &indent = "", const string &subindent = "  ") const = 0;
string print(const string &indent = "", const string &subindent = "  ") const;

printTo() appends the information to an existing string. print() returns a new string with the message. print() is a wrapper around printTo() that creates an empty string, does printTo() into it and returns it.

The printing is normally done in a multi-line format, nicely indented, and the arguments indent and subindent define the initial indent level and the additional indentation for every level.

There is also a way to print everything in one line: pass the special constant NOINDENT (defined in common/StringUtil.h) in the argument indent. This is similar to using an undef for the same purpose in the Perl API.

The definitions of all the types are collected together in type/AllTypes.h.

No comments:

Post a Comment