Monday, April 16, 2012

Manual iteration with LookupJoin

Sometimes you might want to get the list of the result rows from LookupJoin and iterate over them yourself, rather than have it call the labels. To be honest, this looked kind of important when I first wrote LookupJoin, but by now I don't see a whole lot of use in it. Nowadays, if you want to iterate manually, calling findBy() and then iterating looks like a more useful option. But at the time there was no findBy(), and so this feature came to exist. Here is an example:

our $join = Triceps::LookupJoin->new(
  unit => $uJoin,
  name => "join",
  leftRowType => $rtInTrans,
  rightTable => $tAccounts,  rightIdxPath => ["lookupSrcExt"],
  rightFields => [ "internal/acct" ],
  by => [ "acctSrc" => "source", "acctXtrId" => "external" ],
  automatic => 0,
); # would die by itself on an error

# label to print the changes to the detailed stats
my $lbPrintPackets = makePrintLabel("lbPrintPackets", $join->getOutputLabel());

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a command, then string opcode
  my $type = shift @data;
  if ($type eq "acct") {
    $uJoin->makeArrayCall($tAccounts->getInputLabel(), @data)
      or die "$!";
  } elsif ($type eq "trans") {
    my $op = shift @data; # take the opcode off the data, it gets applied to the results below
    my $trans = $rtInTrans->makeRowArray(@data) or die "$!";
    my @rows = $join->lookup($trans);
    foreach my $r (@rows) {
      $uJoin->call($lbPrintPackets->makeRowop($op, $r)) or die "$!";
    }
  }
  $uJoin->drainFrame(); # just in case, for completeness
}

It does the same thing as the first LookupJoin example, only now manually. Once the option "automatic" is set to 0 for the join, the method $join->lookup() becomes available to perform the lookup and return the result rows in an array (the data sent to the input label keeps working as usual, sending the result rows to the output label). This involves the extra overhead of keeping all the result rows (and there might be lots of them) in an array, so by default the join is compiled in an automatic-only mode.

Since lookup() knows nothing about the opcodes, the opcode had to be carried separately around the lookup: taken off the input data before the call and applied to each result row after it.

The result is the same as for the first example, only the name of the result label differs:

acct,OP_INSERT,source1,999,1
acct,OP_INSERT,source1,2011,2
acct,OP_INSERT,source2,ABCD,1
trans,OP_INSERT,1,source1,999,100
lbPrintPackets OP_INSERT id="1" acctSrc="source1" acctXtrId="999"
 amount="100" acct="1" 
trans,OP_INSERT,2,source2,ABCD,200
lbPrintPackets OP_INSERT id="2" acctSrc="source2" acctXtrId="ABCD"
 amount="200" acct="1" 
trans,OP_INSERT,3,source2,QWERTY,200
lbPrintPackets OP_INSERT id="3" acctSrc="source2" acctXtrId="QWERTY"
 amount="200" 
acct,OP_INSERT,source2,QWERTY,2
trans,OP_DELETE,3,source2,QWERTY,200
lbPrintPackets OP_DELETE id="3" acctSrc="source2" acctXtrId="QWERTY"
 amount="200" acct="2" 
acct,OP_DELETE,source1,999,1
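
For readers who want to poke at the logic without Triceps installed, here is a minimal plain-Perl sketch of what the manual mode amounts to. It is not the Triceps API: the hash %accounts and the functions lookup_rows() and format_row() are hypothetical stand-ins for the accounts table, for $join->lookup() and for the print label. It only illustrates the idea that the lookup returns an array of rows (including a row without "acct" when nothing matches, as in the QWERTY case above) and that the caller applies the opcode to each row afterwards.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the accounts table: the key is "source,external",
# the value is the list of matching internal account ids.
my %accounts = (
    "source1,999"  => [1],
    "source1,2011" => [2],
    "source2,ABCD" => [1],
);

# Stand-in for the manual lookup: collects all the result rows in an array
# and returns it. It knows nothing about opcodes. If nothing matches, the
# original row is still returned, just without the "acct" field.
sub lookup_rows {
    my ($trans) = @_;    # hash ref with id, acctSrc, acctXtrId, amount
    my $key = join(",", $trans->{acctSrc}, $trans->{acctXtrId});
    my @accts = @{ $accounts{$key} || [] };
    my @rows;
    if (@accts) {
        foreach my $acct (@accts) {
            push @rows, { %$trans, acct => $acct };
        }
    } else {
        push @rows, { %$trans };
    }
    return @rows;
}

# Stand-in for the print label: formats one row together with the opcode
# that the caller carried around the lookup.
sub format_row {
    my ($op, $row) = @_;
    my @fields = grep { defined $row->{$_} } qw(id acctSrc acctXtrId amount acct);
    my @pairs = map(sprintf('%s="%s"', $_, $row->{$_}), @fields);
    return join(" ", "lbPrintPackets", $op, @pairs);
}

# The manual iteration: take the opcode off, look up, apply the opcode back.
foreach my $line ("OP_INSERT,1,source1,999,100", "OP_INSERT,3,source2,QWERTY,200") {
    my ($op, @data) = split(/,/, $line);
    my %trans;
    @trans{qw(id acctSrc acctXtrId amount)} = @data;
    foreach my $r (lookup_rows(\%trans)) {
        print format_row($op, $r), "\n";
    }
}
```

The sketch also shows where the overhead of the manual mode comes from: lookup_rows() has to build the whole @rows array before the caller sees the first row, while a callback-style version could hand over each row as soon as it is found.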
