Monday, December 29, 2014

formatting the source code snippets

I've gotten around to printing the snippets of the source code in the error messages when the said code fails to compile, and realized that they could be printed better. So I wrote a couple of helper functions to do just that.

First of all, it could be indented better, to match the indenting of the rest of the error message.

Second (and connected), it could be left-aligned better: the code snippets tend to have extra spacing on the left.

And third, it could use the line numbers, to make the error location easier to find.

The magic is done by two functions, Triceps::Code::alignsrc() and Triceps::Code::numalign(). They work in exactly the same way and take the same arguments; the only difference is that numalign() adds the line numbers while alignsrc() doesn't.

Here is an example of use:

        confess "$myname: error in compilation of the generated function:\n  $@function text:\n"
        . Triceps::Code::numalign($gencode, "  ") . "\n";

It can produce an error message like this (with a deliberately introduced syntax error):

Triceps::Fields::makeTranslation: error in compilation of the generated function:
  syntax error at (eval 27) line 13, near "})
"
function text:
     2 sub { # (@rows)
     3   use strict;
     4   use Carp;
     5   confess "template internal error in Triceps::Fields::makeTranslation: result translation expected 1 row args, received " . ($#_+1)
     6     unless ($#_ == 0);
     7   # $result_rt comes at compile time from Triceps::Fields::makeTranslation
     8   return $result_rt->makeRowArray(
     9     $_[0]->get("one"),
    10     $_[0]->get("two"),
    11   );
    12 })
 at /home/babkin/w/triceps/trunk/perl/Triceps/blib/lib/Triceps/Fields.pm line 219
    Triceps::Fields::makeTranslation('rowTypes', 'ARRAY(0x2943cb0)', 'filterPairs', 'ARRAY(0x2943b30)', '_simulateCodeError', 1) called at t/Fields.t line 205
    eval {...} called at t/Fields.t line 204


The first argument of alignsrc() and numalign() is the code snippet string to be aligned. The following will be done to it:

  • The empty lines at the front will be removed. numalign() is smart enough to take the removed lines into account and adjust the numbering. That's why the numbering in the error message shown above starts with 2. You can also get the number of the removed lines afterwards, from the global variable $Triceps::Code::align_removed_lines.
  • The \n at the end of the snippet will be chomped. But only one of them; the rest of the empty lines at the end will be left alone.
  • Then the "baseline" indenting of the code will be determined by looking at the first three and last two lines. The shortest non-empty indenting will be taken as the baseline. If some lines examined start with spaces and some start with tabs, the lines starting with tabs will be preferred as the baseline indenting.
  • The baseline indenting will be removed from the front of all lines. If some lines in the middle of the code have a shorter indenting, they will be left unchanged.
  • The tabs will be replaced by two spaces each. If you prefer a different replacement, you can specify it as the third argument of the function.
  • In numalign() the line numbers will be prepended to the lines.
  • The indenting from the second argument of the function will be prepended to the lines.

The second argument contains the new indenting that aligns the code nicely with the rest of the error message. Technically, it's optional, and defaults to an empty string.

The third argument is optional and provides an alternative replacement for the tab characters in the code. If it's an empty string, it will revert to the default two spaces "  ". To keep the tabs unchanged, specify it as "\t".
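
For illustration, here is a small made-up example of a call that uses all three arguments:

    # A made-up snippet with a leading empty line and tab indenting.
    my $code = "\n\tsub answer {\n\t\treturn 42;\n\t}\n";
    # Indent the result by two spaces and keep the tabs unchanged.
    print Triceps::Code::numalign($code, "  ", "\t"), "\n";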

Friday, December 26, 2014

error reporting in the templates

When writing the Triceps templates, it's always good to make them report any usage errors in terms of the template (though the extra detail doesn't hurt either). That is, if a template builds a construction out of the lower-level primitives and one of these primitives fails, the good approach is not to just pass through the error from the primitive but to wrap it into a high-level explanation.

This is easy to do if the primitives report the errors by returning them directly, as Triceps did in version 1. Just check the result for an error, and if one is found, add the high-level explanation and return it further.

It becomes more difficult when the errors are reported as exceptions, which in Perl means by die() or confess(). The basic handling is easy: there is no need to do anything to let the exception propagate up, but adding the extra information becomes difficult. First, you've got to explicitly check for these errors by catching them with eval() (which is more involved than checking for directly returned errors), and only then can you add the extra information and re-throw. And then there is the pesky problem of the stack traces: if the re-throw uses confess(), it will likely add a duplicate of at least a part of the stack trace that came with the underlying error; and if it uses die(), the stack trace might be incomplete, since the native XS code includes the stack trace only up to the nearest eval(), to prevent the same duplication problem when unrolling the stacks mixed between Perl and Triceps scheduling.

Because of this, some of the template error reporting got worse in Triceps 2.0.

Well, I've finally come up with the solution. The solution is not even limited to Triceps, it can be used with any kind of Perl programs. Here is a small example of how this solution is used, from Fields::makeTranslation():

    my $result_rt = Triceps::wrapfess
        "$myname: Invalid result row type specification:",
        sub { Triceps::RowType->new(@rowdef); };

The function Triceps::wrapfess() takes care of wrapping the confessions. It's very much like the try/catch, only it has the hardcoded catch logic that adds the extra error information and then re-throws the exception.

Its first argument is the error message that describes the high-level problem. This message will get prepended to the error report when the error propagates up (and the original error message will get a bit of extra indenting, to nest under that high-level explanation).

The second argument is the code that might throw an error, like the try-block. The result from that block gets passed through as the result of wrapfess().

The full error message might look like this:

Triceps::Fields::makeTranslation: Invalid result row type specification:
  Triceps::RowType::new: incorrect specification:
    duplicate field name 'f1' for fields 3 and 2
    duplicate field name 'f2' for fields 4 and 1
  Triceps::RowType::new: The specification was: {
    f2 => int32[]
    f1 => string
    f1 => string
    f2 => float64[]
  } at blib/lib/Triceps/Fields.pm line 209.
    Triceps::Fields::__ANON__ called at blib/lib/Triceps.pm line 192
    Triceps::wrapfess('Triceps::Fields::makeTranslation: Invalid result row type spe...', 'CODE(0x1c531e0)') called at blib/lib/Triceps/Fields.pm line 209
    Triceps::Fields::makeTranslation('rowTypes', 'ARRAY(0x1c533d8)', 'filterPairs', 'ARRAY(0x1c53468)') called at t/Fields.t line 186
    eval {...} called at t/Fields.t line 185

It contains both the high-level and the detailed description of the error, and the stack trace.

The stack trace doesn't get indented, no matter how many times the message gets wrapped. wrapfess() uses a slightly dirty trick for that: it assumes that the error messages are indented by spaces while the stack trace from confess() is indented by a single tab character. So the extra indenting spaces are added only to the lines that don't start with a tab.
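
In code, the trick boils down to something like this (a minimal sketch of the idea, not the actual Triceps source):

    # Indent by two spaces only the lines that don't start with a tab,
    # leaving the tab-indented stack trace from confess() as it was.
    my $msg = $@;
    $msg =~ s/^(?!\t)/  /mg;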

Note also that even though wrapfess() uses eval(), there is no eval above it in the stack trace. That's the other part of the magic: since that eval is not meaningful, it gets cut from the stack trace. wrapfess() also uses it to find its own place in the stack trace, the point from which a simple re-confession would dump a duplicate of the stack. So it cuts the eval and everything under it from the original stack trace, and then does its own confession, inserting the stack trace again. This works very well for the traces thrown by the XS code, which actually doesn't write anything below that eval; wrapfess() then adds the missing part of the stack.

Wrapfess() can do a bit more. Its first argument may instead be a code reference that generates the error message on the fly:

    my $result_rt = Triceps::wrapfess sub {
            "$myname: Invalid result row type specification:"
        },
        sub { Triceps::RowType->new(@rowdef); };

In this small example it's silly, but if the error diagnostics are complicated and require some expensive printing of the data structures, the code reference will be called only if the error actually occurs, and the normal code path will avoid the extra overhead.

It gets even more flexible: the first argument of wrapfess() might also be a reference to a scalar variable that contains either a string or a code reference. I'm not sure yet if it will be very useful, but it was easy to implement. The idea is that it allows writing only one wrapfess() call and then changing the error messages from inside the second argument, providing different error reports for its different sections. Something like this:

    my $eref;
    return Triceps::wrapfess \$eref,
      sub {
        $eref = "Bad argument foo";
        buildTemplateFromFoo();
        $eref = sub {
          my $bardump = $argbar->dump();
          $bardump =~ s/^/    /mg;
          return "Bad argument bar:\n  bar value is:\n$bardump";
        };
        buildTemplateFromBar();
        ...
      };

It might be too flexible, we'll see how it works.

Internally, wrapfess() uses the function Triceps::nestfess() to re-throw the error. Nestfess() can also be used directly, like this:

    eval {
      buildTemplatePart();
    };
    if ($@) {
      Triceps::nestfess("High-level error message", $@);
    }

The first argument is the high-level descriptive message to prepend; the second argument is the original error caught by eval. Nestfess() is responsible for all the smart processing of the indenting and stack traces; wrapfess() is really just a bit of syntactic sugar on top of it.

Saturday, December 20, 2014

nested record and ORM

Some more thoughts on the subject. First of all, why should anyone care about the nested records?

Nowadays people are used to the relational model. But next to all of this sits Google with its protobufs. It has published the information about protobufs and about some of its systems that use them, but I think there is not enough appreciation of how much the protobufs mean for Google. If you look at the published information, you'll realize that everything at Google, totally everything, is based on protobufs. They're used as the arguments of the RPC calls and as the returned values, the map-reduce is built on them, as well as all kinds of the no-SQL DBMS and a few SQL DBMS. And there are good reasons why they're used so much. Well, for the no-SQL databases it's a no-brainer: if all the database can do is store the key-value pairs, at least the values must be capable of representing complexity. To understand the other reasons, let's look at the difference between the nested records and the Object-Relational Mappers (ORMs).

Entirely coincidentally, I've recently encountered a reference to the article at http://blogs.tedneward.com/2006/06/26/The%2BVietnam%2BOf%2BComputer%2BScience.aspx, on the subject of ORM. It's an interesting viewpoint, though neither the conclusions nor a few of the presuppositions seem quite right to me.

The idea that the object model has the concept of identity while the relational model doesn't seems wrong to me. In reality the primary keys are used all over the real databases. And the primary keys are exactly that, the identity. Even more so if the primary key is not a part of the data but an auto-generated unique value. Here we can go into the discussions about "but it's just a hack, not a part of the theory", but think about this: these auto-generated unique keys are the bread and butter of the data normalization, which certainly is a part of the theory. The point of normalization is to get rid of the duplicate data in the rows and move it to a separate table, where it will exist as a single copy, with multiple short references to it from the original table. But the references between the objects work the same way: multiple objects may refer to the same subordinate object. And the reference by that object's address (its identity) is just the same thing as the references to that unique key in the subordinate table.

So, what ORM does is map the objects one-to-one to the records, and map the references between the objects by address to the references between the records by the unique id values. The joins are then the way to navigate the references between the objects.
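
As a made-up illustration in SQL-like pseudo-code, the auto-generated key of the subordinate table plays exactly the role of an object's address:

create table address (
  address_id integer primary key, -- auto-generated identity, the "object address"
  street varchar,
  city varchar
);
create table person (
  person_id integer primary key,
  name varchar,
  address_id integer references address(address_id) -- many persons may refer to one address
);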

The nested records in protobufs (or their equivalents) have a different meaning and are used differently. The whole point of a nested record is that it represents a self-contained unit of data. This data is all stored together because it's normally used together. It's easy to see when the protobufs are used for RPC, but it works the same when the protobufs are used in a database. Obviously, not everything in the world can be covered by this usage, and there are downsides. An obvious one is that a nested record in a protobuf can't be modified piecemeal: if you need to change it, you have to build a whole new record and put it back in place. And obviously the data is stored by value, not by reference, so any shared data has to be duplicated. But for the situations where the self-containment is really important, they work great. As in all the massively-parallel processing at Google. Each nested record has enough information in it to be processed separately from the others. Again, not everything in the world can be expressed like this, but what can, benefits tremendously from the nested records.

Now I think I can formulate the difference between the ORM and the nested records: the ORMs connect the objects by reference while the nested records bunch up the data by value. Different uses, different trade-offs, different implementations. (Though for the in-memory implementations the boundary is more blurred.)

And I think I've found the explanation of why they say that the Oracle's nested records are slower than the manual joins, even though they amount to joins done in the ORM-like way. Well, of course it could just be a limitation of the query optimizer: if the optimizer runs first and only then the nesting is expanded, it obviously would not be as optimal. But that should be easy to fix, just expand the nesting and then run the optimizer, so this is unlikely to be the problem. A more fundamental problem seems to be that the records are sent into the database with the data by value, and then they get broken up into the flat tables and stored with connections by reference. It successfully combines the data duplication of by-value and the cost of the extra joins of by-reference. Both kinds of shortcomings rolled together.

Coming back to the point that not everything in the world can be represented as the nested records in protobufs: this means that the references between the tables (or objects, if you prefer to think about them that way) are still needed, and the nested records still need to use them. Which means fitting them into the relational model. And that's where my whitepaper comes in: how to represent the nested records in the relational model. They can be logically dis-joined into the "underlying tables", and the relational operations can be applied to them in the usual way.

Which can also solve the problem of representing the object inheritance and polymorphism in ORMs, described in the article I linked above. Well, representing the polymorphic data as nested records is fairly straightforward, as long as the records contain the indication of subtypes. And as I've shown in the whitepaper, mapping the nested records to the relational use is also straightforward. Put them together and you have the solution.

Thursday, December 18, 2014

more on nested records (and Oracle)

As people have pointed out to me, Oracle has the concepts of "nested tables" and "variable arrays" (VARRAY). Since at least version 9i, a very long time ago.

The nested tables have a difference in how they store the data: they break out the repeating sections into a separate table and automatically create the id fields to link them together. So the select just does an implicit join. It's like an embedded ORM (by the way, there is also a concept of references to rows, also apparently implemented in an ORM way). Strangely, they say that the nested tables are slower than the usual joins. Not sure why.
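
For illustration, the declarations might look like this (a made-up sketch with hypothetical names, following the Oracle syntax):

create type line_item as object (sku varchar2(20), qty number);
create type line_item_tab as table of line_item;
-- the repeating section is stored in a separate, automatically linked table
create table orders (
  order_id number,
  items line_item_tab
) nested table items store as orders_items;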

The VARRAYs are more like the real nested structures: they store the data directly in the row, and they preserve the order of the nested elements. However, the maximal size of a VARRAY has to be declared in advance.
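
Again, as a made-up sketch:

-- the maximal size (here 10) has to be given up front
create type phone_list as varray(10) of varchar2(20);
create table person (
  name varchar2(100),
  phones phone_list
);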

Obviously, the nested tables (the ORM version) would be much faster on updates: they can change the nested parts very easily without changing the rest of the records. And also faster on scanning of the non-repeating parts.

The version with the repeated parts stored directly in the records, such as protobufs or VARRAYs, should be faster on reading these repeated parts.

The syntax for accessing the nested parts is like this:

select
  upper.key,
  nested.*
from
  main_table upper,
  table(upper.nested) nested;

So you can select the nested fields as a single field of a record type, or you can flatten them as shown above. And the joins within the same top-level records should be possible to express in the same syntax, like this:

from
  main_table upper,
  table(upper.nested1) nested1,
  table(upper.nested2) nested2;

There also are the examples of deleting the nested parts, like:

delete from table (
  select nested from upper where key = 1234
) nested
where
  nested.field = 5678;

Or even by the whole value of the nested part.

And by the same example, you can insert a nested part into the record, or update one.
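
For example, with a nested table of a single scalar column, the insertion might look like this (a made-up sketch following the same pattern as the delete above):

insert into table (
  select nested from upper where key = 1234
) values (5678);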

But the part that seems to be missing is the un-flattening. Can you select data from one table with nested data, re-format the nested parts, create different nested records, and either return them from select or insert them into another table? I haven't found any examples of that. It looks like the nested records aren't first-class citizens in Oracle either.

Friday, December 12, 2014

Relational Operations on the Nested Data

I wrote this little whitepaper that I don't know where to put, so I'll just put it here. It gives some interesting ideas for the future development of Triceps. The PDF version can also be found at http://web.newsguy.com/sab123/papers/rel_nested.pdf

It might also help to first read my explanation of why the nested data is important and how it differs from the Object-Relational Mappers, at http://babkin-cep.blogspot.com/2014/12/nested-record-and-orm.html.


Relational Operations on the Nested Data

Sergey A. Babkin

Abstract

This paper shows a way to integrate the RDBMS and the nested records to the mutual benefit. The nested records dovetail with the relational model to cover the weaknesses of the latter, such as the representation of the multi-dimensional 1:M joins, and to allow an optimization of other kinds of joins.

Summary

The use of the nested data structures seems to be growing in popularity for the distributed data processing: Google has Protobufs, Hadoop and Bing have their own formats, XML has been around for a long time, and so on. Among other things, they are used as records in the non-relational and kind-of-relational DBMS. While at Google, I've used the Dremel system that implements a read-only columnar database over the tables of nested records in the protobuf format, and while it was convenient in some ways, in other ways the operations in it felt unnecessarily difficult. To be honest, by now I can't even formulate why exactly it felt so: it's been a while since I've used Dremel and I've forgotten the details, so I have to guess from the logical considerations in the published description in [Dremel] (http://research.google.com/pubs/pub37217.html). The theory behind it is described very nicely in [Nested] (http://peqnp.blogspot.com/2013/01/nested-relational-algebra.html) and in the documents it refers to. It was surprising to find out that their model is based on a paper from 1988 [Report] (http://www.cs.indiana.edu/cgi-bin/techreports/TRNNN.cgi?trnum=TR259), and apparently nothing much has happened since then. Everything I've found on the subject seems to be based on [Report].

In this paper I want to show that:
1. There is more to the nested data structures than described in [Report].
2. There is a straightforward way to map the nested data to the relational operations and take advantage of its locality.
3. A common record-based (as opposed to columnar, such as Dremel) RDBMS can be adapted to take advantage of the nested records with a small set of changes.

To forestall a typical question, this is not a proposal for a way of doing ORM that translates from objects to flat tables, and is not in any way related to ORM. It is about the adoption of the nested records into the relational theory.

Nested data is multi-dimensional

The [Report] deals with the nesting but all the nesting discussed there is one-dimensional. The schema of an example from there can be written out in a pseudo-code similar to Google protobufs in Listing 1. Even though the nesting is two-deep, it goes in only one direction, growing in one dimension.

struct client {
  string name;
  string address;
  repeated struct company {
    string name;
    repeated struct share {
      float price;
      string date;
      int quantity;
    } shares;
  } companies;
};
Listing 1. One-dimensional nested data structure.

But the real-world nested records tend to have multiple repeated elements. For example, consider a structure describing a theatrical company. It might list the pieces produced by this company in the current season and the actors in the company, as shown in Listing 2. This structure has two separate repeating fields going in parallel at the same level, expanding in two dimensions.

struct company {
  string name;
  repeated struct piece { // dimension 1
    string name;
    string start_date;
    string end_date;
  } pieces;
  repeated struct actor { // dimension 2
    string name;
    string bio;
  } actors;
};
Listing 2. Two-dimensional nested data structure.

For another example, the [Dremel] paper describes a structure shown in Listing 3.

struct Document {
  int64 DocId;
  struct {
    repeated int64 Backward; // dimension 1
    repeated int64 Forward; // dimension 2
  } Links;
  repeated struct { // dimension 3
    repeated struct Language {
      string Code;
      string Country;
    };
    string Url;
  } Names;
};
Listing 3. Three-dimensional nested data structure from the [Dremel] paper.

This example has four repeated elements in three dimensions, one of the dimensions nesting two-deep.

Or there might be further cross-references in the nested data. For example, the theatrical company description may have a list of actors for each piece, as shown in Listing 4. Rather than repeating the actor's biography in each piece, it can be found by name, descending down the other dimension. Note that the field piece.actor_name is also repeated, creating a nesting two levels deep.

struct company {
  string name;
  repeated struct piece { // dimension 1
    string name;
    string start_date;
    string end_date;
    repeated string actor_name; // cross-reference to dimension 2
  } pieces;
  repeated struct actor { // dimension 2
    string name;
    string bio;
  } actors;
};
Listing 4. Two-dimensional nested data with nested cross-references.

The [Report] doesn't provide a solution for the multi-dimensional data. There even are papers like [NoGood] (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.89.2966&rep=rep1&type=pdf) that argue that the nested structures are not a good idea to start with, citing in particular the cross-references like the ones shown above between the pieces and the actors. But with a little different approach, the nested structures are quite useful.


Nested data is the result of a Join


People argue everywhere, including Wikipedia, that the nested data saves the need for joins. And it looks obvious, yet I haven't seen this fact stated before in another form, looking from a different standpoint:

A nested structure can be thought of as the result of a join.

The abovementioned theatrical company data from Listing 4 can be represented in the pure relational way as multiple joined tables, shown in an SQL-like pseudo-code in Listing 5.


create table company (
  company_id integer,
  name varchar
);

create table pieces (
  company_id integer,
  piece_id integer,
  piece_name varchar,
  start_date varchar,
  end_date varchar
);

create table piece_actors (
  company_id integer,
  piece_id integer,
  p_actor_name varchar
);

create table actors (
  company_id integer,
  actor_name varchar,
  actor_bio varchar
);

create view company_w as
select * from
  company
  left outer join pieces // dimension 1
    on company.company_id = pieces.company_id
  left outer join piece_actors // nesting in dimension 1
    on pieces.company_id = piece_actors.company_id
    and pieces.piece_id = piece_actors.piece_id
  left outer join actors // dimension 2
    on company.company_id = actors.company_id
;
Listing 5. Construction of the nested records by joining the flat tables.

This glosses over the naming of the fields in the flat "select *", but if the field names used in the join conditions are named the same in each use, and the rest of the field names are unique, the combination is straightforward and easy to understand.

Having created the join result in Listing 5, the records can be packed into the nested form according to the rules in [Report].

Since the join was done by two dimensions (one went towards actors, another towards pieces and piece_actors), the packing has to be done along two dimensions as well. [Report] discusses only the one-dimensional packing. However the same principle can be easily applied to pack multiple dimensions.

The [Report] cannot tackle more than one dimension because in this situation it can’t decide on the grouping of the fields.

But looking at the nested data as the join results, the dimensionality of the nested data directly translates to the dimensionality of a join, and the other way around, which is a known relational feature. Thus the join provides the guidance to how to do the grouping. And obviously the different subsets of fields have to be grouped separately, according to the join condition that brought them from their flat record. Figure 1 shows the diagram of the grouping.

Fields of company
+-- Fields of pieces (dimension 1)
|     +-- Fields of piece_actor
+-- Fields of actors (dimension 2)
Figure 1. A nested record as constructed from the flat records by a join.

Note that the purely relational results of the join would have been hugely redundant and next to impossible to process because both dimensions are 1:M joins, and even more so because the dimension by pieces goes two levels deep. But having been packed into a nested data structure, they are compact and easily manageable. The nested records provide a convenient and compact representation for the results of the joins of this kind.

The particular branches of the nested structures can be chosen for traversing if the query makes reference only to these branches. For example, for the query shown in Listing 6, there is obviously no need to traverse down the dimension by pieces (shown in Figure 1 as "dimension 1"), only the dimension by actors ("dimension 2") is needed.


select company_name, actor_name, actor_bio
from company_w;
Listing 6. An example of a query from the joined view.

In terms of the nested data structures from Listing 4, the same query from Listing 6 becomes the one shown in Listing 7. Or the whole packed sub-record of actors can be selected by the even simpler query shown in Listing 8.

select name, actors.name, actors.bio
from company;
Listing 7. An example of a query from a nested data structure.

select name, actors
from company;
Listing 8. An example of a query selecting the whole sub-record from a nested data structure.


Optimization of other joins with the nested data

What if we want to select the bios of all the actors in a particular piece, using the cross-reference in piece_actors? Then of course the original join represented in the nested structure is not of much use. Another join would be needed, like the one shown in Listing 9.

select actors.actor_name, actors.actor_bio from
  company
  left outer join pieces
    on company.company_id = pieces.company_id
  left outer join piece_actors
    on pieces.company_id = piece_actors.company_id
    and pieces.piece_id = piece_actors.piece_id
  left outer join actors
    on piece_actors.company_id = actors.company_id
    and piece_actors.p_actor_name = actors.actor_name
where
  company.name = 'COMPANY1'
  and pieces.piece_name = 'PIECE1'
;
Listing 9. Join using the cross-reference.

The nested structure by itself is not of much help here. An index would be needed to join piece_actors and actors efficiently. But it can still provide a performance benefit. To understand it, let's look again at [Dremel].

Dremel can perform the large queries fast by partitioning the data across a great many machines. It can do this because each nested record is independent, and thus a very fine-grained partitioning is easily possible. Even an aggregation can be similarly partitioned. However, Dremel has a complication with the joins. Dremel can easily join a large table to a small table, by partitioning the large table and copying the small table to accompany each partition to its processing machine. But it just plain can't join two big tables, because of the combinatorial complexity.

However, in many cases an optimization is possible. Suppose that we have the information about a huge number of theatrical companies in a table of nested records, and this table is big. Since the nesting cannot be used directly, the selection of actor biographies by piece would require the self-join shown in Listing 10. It's a join of a large table to another large table, so Dremel could not handle it.


select c2.actors from
  company c1
  join company c2
    on c1.name = c2.name
    and c1.piece.actor_name = c2.actors.name
where
  c1.name = 'COMPANY1'
  and c1.piece.name = 'PIECE1'
;
Listing 10. Join on nested records using the cross-reference.

But now let's look again at the nested records as representations of the results of a join in Listing 5. Note that the original join company_w already guarantees that the information about the actors in a piece and the information about the actors themselves both belong to the same company. So when we look for the actors, we don't need to look for them in the whole table. We need to look for them only in the same nested record. This brings back the locality and easy partitioning. The query can be written in the pseudo-code form shown in Listing 11. The "scope self join" there means that it is not the whole table company that gets joined with itself; rather, one nested record from the table company is joined with the same nested record.

select c2.actors from
  company scope self join with aliases (c1, c2)
    on c1.piece.actor_name = c2.actors.name
where
  c1.name = 'COMPANY1'
  and c1.piece.name = 'PIECE1'
;
Listing 11. Join on nested records using the cross-reference and the data locality.

It can even be made more efficient by using indexes. Dremel is a columnar DBMS, so it keeps the indexes global and can't help this situation much. But a normal record-based DBMS can keep a separate index in each nested record and use it for these self-joins.

When the nested records are not too large, this record-local index doesn't have to be kept on disk; instead, it can be built when the record is loaded into memory: the actors dimension can be iterated to build the index in memory, and then the piece_actors dimension can be iterated to build the join using that index. Of course, if the records are very small then the indexes would not provide any advantage.
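
As a minimal illustration of the in-memory technique, here is a made-up Perl sketch, assuming that a nested record with the structure of Listing 4 has been decoded into a hash reference $company:

# Iterate the actors dimension once, building a hash index by name.
my %actor_index = map { ($_->{name}, $_) } @{$company->{actors}};
# Then iterate the pieces dimension, resolving the cross-references
# through the local index instead of searching the whole table.
for my $piece (@{$company->{pieces}}) {
  for my $aname (@{$piece->{actor_name}}) {
    my $actor = $actor_index{$aname} or next;
    print "$piece->{name}: $actor->{name}\n";
  }
}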

Even without the massive partitioning, the self-joins within the same nested record remove the need to search through the whole table and thus greatly improve the locality of access. The self-joins could also be performed within a sub-hierarchy of a nested record.

Optimization of the aggregations with the nested records

The aggregations can also often be conveniently limited to a nested record or its sub-hierarchy. It's equivalent to the key of that record or hierarchy being used as the aggregation key or a part of it. But it allows an extra optimization, since all the data is already grouped locally.

This condition guarantees that the nested record or its sub-hierarchy translates to one or more aggregation groups, and that no other record contributes to these groups. So there is no need to keep the group state through the whole query: the group can be aggregated and its result produced by iterating through one nested record, after which the state of that group can be purged.
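
For example, a made-up query in the style of Listing 7 that counts the actors in each piece uses the record key name as a part of the aggregation key, so each nested record translates to its own set of groups and can be aggregated and forgotten independently:

select name, pieces.name, count(pieces.actor_name)
from company
group by name, pieces.name;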

Related technologies

MS SQL Server already contains the APPLY clause that uses a similar but only partial approach, which serves as a validation of the usefulness of the concept. The results of an APPLY can potentially be fed directly into the NEST operator as described in [Nested].

However, APPLY has a limitation, very much the same as the limitation of the nested model described in [Nested]: the SELECT with APPLY produces the flat relational records, and thus can efficiently handle only one APPLY per SELECT (unless all the APPLYs but one are 1:1). Having multiple APPLY clauses with 1:M relations would cause a combinatorial explosion of every possible combination. In some cases this might be the desired result, but probably not typically.
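
To illustrate the limitation, here is a made-up T-SQL sketch, with PiecesOf and ActorsOf standing for hypothetical table-valued functions that return the repeated sections:

-- Two 1:M APPLYs in one SELECT flatten the result into the
-- cross-product of pieces and actors for each company row.
select c.name, p.piece_name, a.actor_name
from company c
cross apply dbo.PiecesOf(c.company_id) p
cross apply dbo.ActorsOf(c.company_id) a;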

On the other hand, a nested record with multiple dimensions can represent this kind of result set in a compact and efficient form. And then this compact result can be either transmitted out or used for some following operations. For example, if aiming for a massively parallel processing, the first step can group the relevant data into the nested records that contain all the relevant information, and the second step can farm out these records to thousands of machines processing them without the need for the external references.

The importance of nested records for the conventional RDBMS

To reiterate, this is not a proposal for a way of doing ORM that translates from objects to flat tables, and is not in any way related to ORM. It's not about the XPath/XSLT kind of querying either. It's a proposal of a way to extend the relational principles to the nested data structures, and thus make the nested data structures a first-class citizen in an RDBMS.

As described in the previous section, a nested record with multiple dimensions can represent the result set of a multi-dimensional join in a compact and efficient form.

To see why the grouping is important for the massively parallel processing, it's useful to look again at Dremel's limitation in the processing of joins: it can join two tables only if one of them is small. The reason is that handling a query on a large table without a join can be parallelized easily: if you have 1000 machines, you can split the source table into 1000 partitions and have each machine work on its partition in parallel, then combine the results. But if this table gets joined to another table, each of these 1000 machines must have access to the whole second table. That's easy if the second table is small (the access to it from 1000 machines can then be efficiently cached), but if the second table is big then the access becomes much more expensive. However, if the data happens to have some special locality properties, where the mutually relevant data from both tables can be combined into a single message, then these messages can again be efficiently spread out to 1000 machines.

Thus having the nested records treated as first-class citizens in the relational databases can make this kind of operations easier, letting the relational databases include the advantages of the hierarchical databases. "First-class" means that not only should the tables be able to contain the nested records, but the SELECTs should also be able to produce the nested records as results.

It would preserve all the relational logic unchanged, since on the logical level the nested records would be expanded into the flat classic form, the operations performed, and the results packed back into the nested form. Of course, the actual execution would avoid doing the expensive intermediate steps of expansion directly; instead, they can be optimized out, producing the same result in a more efficient way.

Adapting a conventional RDBMS for nested records

As shown above, taking advantage of the nested records in a common record-oriented RDBMS looks fairly straightforward. The following summarizes the changes:

  1. Support for the nested record format. It must be supported not only in the input and tables but also in the output, in the query results and in the intermediate values. This includes the creation of the nested record types on the fly.
  2. Support for the record-local indexes, to be stored along with the records and to be used in the in-record self-joins.
  3. The full-table indexes should probably locate the data with the precision of a nested record, with the further search done by the in-record index.
  4. Syntactic support in the query language for the scope-limited self joins (in a nested record, or in its sub-node).
  5. The scope-limited joins don't have to be self-joins with the same nested record on both sides. In some cases it might be useful to find the second nested record and then perform the cross-product of the fields A from record 1 and the fields B from record 2.
  6. Each node of a nested record can be seen as one side of a join in the formation of the nested-record-as-a-join-result. Being able to refer to all the sub-nodes growing from that node would be highly useful when writing a join. Thus it would make sense to generate a pseudo-field similar to rowid for each node, making it a unique key of this node.

From the data storage standpoint, there is probably not much use in trying to modify the nested records in place. In reality the nested records tend to be of a more reasonable size, so when a nested record is updated, the whole old record can be replaced with a new one. The same approach can be applied to the record locking: locking the whole nested record. If the nested records grow over a certain limit, they can be stored in fragments, by sub-trees.

Another possible approach is the automatic disassembly of the nested records into their flat “underlying tables” by reversing the join. Then they can be restored back by repeating the join.

Yet another approach may create the nested records automatically from the flat tables by joining them and storing the results as nested. This operation is similar to the creation of the materialized views, only with a different storage format. This can improve the efficiency of the data access in certain cases. The original tables may then be kept or become virtual, since their data can always be extracted back from the nested format.

Conclusion

The relational database systems can take great advantage of the nested records by treating them as join results, as well as by producing the results of the common joins in the nested format, and by "un-joining" the nested records in querying. This support should be reasonably easy to add to the existing RDBMS with reasonably small extensions.

References

[Dremel] http://research.google.com/pubs/pub37217.html Dremel: Interactive Analysis of Web-Scale Datasets
[Nested] http://peqnp.blogspot.com/2013/01/nested-relational-algebra.html Nested Relational Algebra
[NoGood] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.89.2966&rep=rep1&type=pdf The Nested Relational Data Model is Not a Good Idea
[Report] http://www.cs.indiana.edu/cgi-bin/techreports/TRNNN.cgi?trnum=TR259 A Recursive Algebra for Nested Relations