Saturday, May 11, 2013

on the export of table types

I've made an example of exporting the table types between threads, but it didn't come out well. It turned out not to particularly need this feature, and came out contrived and ugly. I'm working on a better example now, so in the meantime I want to tell more about the subject matter.

As I've said before, the limitation of exporting the table types between the threads is in keeping the involved Perl code snippets as source code. Their support for the sorted and ordered indexes has already been described, and I've also mentioned the aggregators. I've done this support for the basic aggregators too, with a side effect that the fatal errors in both the index and aggregator code snippets are now propagated much more nicely and come out as the table operations confessing. However, when it came to the SimpleAggregator, I found that I can't just do it as-is: the missing piece of the puzzle is an aggregator initialization routine that would run at the table type initialization time. It's a nice feature to have overall, but I'm trying to cut a release here and push everything non-essential past it.

Fortunately, some thinking has shown that this feature is not really needed. There usually just isn't any need to export a table type with aggregators. Moreover, there is a need to export the table types with many elements stripped. What is to be stripped and why?

The most central part of the table type is its primary index. It defines how the data gets organized. The secondary indexes and aggregators then perform the computations on the data in the table. The tables cannot be shared between threads, and thus the way to copy a table between the threads is to export the table type and send the data, and let the other thread construct a copy of the table from that. But the table created in another thread really needs only the base data organization. If it does any computations on that data, those would be its own computations, different from the ones in the exporting thread. So all it needs to get is the basic table type with the primary index, very rarely some secondary indexes, and pretty much never the aggregators.

The way to get such a stripped table type with only the fundamentally important parts is:

$tabtype_fundamental = $tabtype->copyFundamental();

That copies the row type and the primary index (the whole path to the first leaf index type) and leaves out the rest. None of the aggregators on any of the indexes, even on the primary one, are included in the copy. In the context of making a full nexus, it can look like this:

$facet = $owner->makeNexus(
    name => "data",
    labels => [ @labels ],
    tableTypes => [
        mytable => $mytable->getType()->copyFundamental(),
    ],
    import => "writer",
);

If more index types need to be included, they can be specified by path in the arguments of copyFundamental():

$tabtype_fundamental = $tabtype->copyFundamental(
    [ "byDate", "byAddress", "fifo" ],
    [ "byDate", "byPriority", "fifo" ],
);

The paths may overlap, as shown here, and the matching subtrees will be copied correctly, still properly overlapping in the result. There is also a special syntax:

$tabtype_fundamental = $tabtype->copyFundamental(
    [ "secondary", "+" ],
);

The "+" in the path means "follow the path to the first leaf index of that subtree" and saves you from writing out the whole path.


Finally, what if you don't want to include the original primary index at all? You can use the string "NO_FIRST_LEAF" as the first argument. That would skip it. You can still include it by using its explicit path, possibly at a different position.


For example, suppose that you have a table type with two top-level indexes, "first" as the primary index and "second" as the secondary one, and make a copy:


$tabtype_fundamental = $tabtype->copyFundamental(
    "NO_FIRST_LEAF",
    [ "second", "+" ],
    [ "first", "+" ],
);

In the copied table type the index "second" becomes primary and "first" secondary.
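
To make the path rules above easier to picture, here is a small stand-alone sketch in plain Perl. It does not use Triceps at all: the nested-hash model of an index tree and the functions node(), find_sub(), copy_path(), copy_fundamental() and show() are all made up for this illustration, and the way the real copyFundamental() works inside may well differ. It only mimics the behavior described in this post: the path to the first leaf is copied by default, the extra paths get merged in, "+" expands to the first leaf of a subtree, and "NO_FIRST_LEAF" skips the default path.

```perl
use strict;
use warnings;

# Hypothetical model, NOT the Triceps API: an index type is a hash with an
# ordered list of named sub-indexes, { subs => [ [ name, node ], ... ] }.
sub node { return { subs => [ @_ ] }; }

# Find a direct sub-index by name; returns undef if there is none.
sub find_sub {
    my ($node, $name) = @_;
    for my $pair (@{ $node->{subs} }) {
        return $pair->[1] if $pair->[0] eq $name;
    }
    return undef;
}

# Copy one path from $src into $dst, reusing the nodes that are already
# copied, so that the overlapping paths share their common prefix.
sub copy_path {
    my ($src, $dst, @path) = @_;
    while (@path) {
        my $name = shift @path;
        if ($name eq "+") {
            # "+" keeps descending into the first sub-index until a leaf
            last unless @{ $src->{subs} };
            $name = $src->{subs}[0][0];
            unshift @path, "+";
        }
        my $next = find_sub($src, $name)
            or die "no index '$name' in the source type\n";
        my $copy = find_sub($dst, $name);
        if (!$copy) {
            $copy = node();
            push @{ $dst->{subs} }, [ $name, $copy ];
        }
        ($src, $dst) = ($next, $copy);
    }
}

# Model of the copy: the first leaf path by default, then the extra paths.
sub copy_fundamental {
    my ($src, @args) = @_;
    my $dst = node();
    if (@args && !ref $args[0] && $args[0] eq "NO_FIRST_LEAF") {
        shift @args;                  # skip the default primary index
    } else {
        copy_path($src, $dst, "+");   # the path to the first leaf
    }
    copy_path($src, $dst, @$_) for @args;
    return $dst;
}

# Render the tree as a string like "a(b(c)),d" to show the index order.
sub show {
    my ($node) = @_;
    return join ",", map {
        my $s = show($_->[1]);
        $_->[0] . ($s ne "" ? "($s)" : "");
    } @{ $node->{subs} };
}

my $tt = node(
    [ first  => node([ fifo => node() ]) ],
    [ second => node([ byDate => node([ fifo => node() ]) ]) ],
);

print show(copy_fundamental($tt)), "\n";
# prints: first(fifo)
print show(copy_fundamental($tt,
    "NO_FIRST_LEAF", [ "second", "+" ], [ "first", "+" ])), "\n";
# prints: second(byDate(fifo)),first(fifo)
```

In this model the order of the top-level indexes in the copy is simply the order in which the paths get added, which is why "second" ends up first, in the primary position.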
