Monday, April 23, 2012

LookupJoin key by pattern

While working on the table-to-table joins, I've been also retrofitting some of the features to the LookupJoin. So, here are two changes, both in the department of the join condition:

First, now it checks that the fields in the right side of the option "by" are actually all matching the keys fields in right-side index.

Second, now the join condition may also be specified in a pattern form, the same as the result. For example the condition from the previous examples

by => [ "acctSrc" => "source", "acctXtrId" => "external" ],

can be also specified as:

byLeft => [ "acctSrc/source", "acctXtrId/external" ],

The option name "byLeft" says that the pattern specification is for the fields on the left side (there is no symmetric "byRight"). The substitutions produce the matching field names for the right side. Unlike the result pattern, here the fields that do not find a match do not get included in the key. It's as if an implicit "!.*" gets added at the end. In fact, "!.*" really does get added implicitly at the end.

Of course, for this example either option doesn't make much difference. It starts making the difference when the key fields follow a pattern. For example, if the key fields on both sides have the names "acctSrc" and "acctXtrId", the specification with the new option becomes a little simpler:

byLeft => [ "acctSrc", "acctXtrId" ],

Even more so if the key is long, common on both sides, and all the fields have a common prefix. For example:

k_AccountSystem
k_AccountId
k_InstrumentSystem
k_InstrumentId
k_TransactionDate
k_SettlementDate

Then the key can be specified simply as

byLeft => [ "k_.*" ],

If the settlement date doesn't matter, it can be excluded:

byLeft => [ "!k_SettlementDate", "k_.*" ],

If the right side represents a swap of securities, it might have two parts to it, each describing its half with its key:

BorrowAccountSystem
BorrowAccountId
BorrowInstrumentSystem
BorrowInstrumentId
BorrowTransactionDate
BorrowSettlementDate
LoanAccountSystem
LoanAccountId
LoanInstrumentSystem
LoanInstrumentId
LoanTransactionDate
LoanSettlementDate

Then the join with its borrow part can be done with

byLeft => [ 'k_(.*)/Borrow$1' ],

Makes the long keys easier to drag around.

No comments:

Post a Comment