Saturday, May 19, 2012

The field processing helpers

The field list manipulation used by the joins is also available for reuse. It's grouped in the package Triceps::Fields.

The function

$result = &Triceps::Fields::isArrayType($simpleType);

checks whether the simple type is an array type, from the standpoint of representing it in Perl. The array types are those that end with "[]", except "uint8[]" (because it's represented as a Perl scalar string).
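The rule is simple enough to restate in plain Perl. The following is only a sketch written from the description above, not the Triceps source; the name isArrayTypeSketch is made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Triceps::Fields::isArrayType(), written only
# from the rule described above: array types end with "[]", except "uint8[]".
sub isArrayTypeSketch {
    my ($simpleType) = @_;
    return 0 if $simpleType eq 'uint8[]'; # represented as a Perl scalar string
    return ($simpleType =~ /\[\]$/) ? 1 : 0;
}

foreach my $t ('int32', 'int32[]', 'uint8', 'uint8[]', 'float64[]') {
    printf "%-10s %s\n", $t, isArrayTypeSketch($t) ? "array" : "scalar";
}
```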

The function

@fields = &Triceps::Fields::filter($callerDescr, \@incomingFields, \@patterns);

filters a list of fields by a pattern list of the same form as used in the join results.  For example:

my @leftfld = $self->{leftRowType}->getFieldNames();
my @res = &Triceps::Fields::filter(
  "Triceps::LookupJoin::new: option 'leftFields'",
  \@leftfld, $self->{leftFields});

$callerDescr is the description of the caller used in the error messages, \@incomingFields is a reference to an array with the field names to be filtered, and \@patterns is a reference to an array of field name patterns. Each pattern is a string in one of the forms:

  • 'regexp' - pass through the field names matching the anchored regexp (i.e. implicitly wrapped as '^regexp$').
  • '!regexp' - throw away the field names matching the anchored regexp.
  • 'regexp/regsub' - pass through the field names matching the anchored regexp, performing a substitution on it.

If a field name doesn't match any of the patterns, it doesn't pass through the filter.

Each field is checked against each pattern in order, and the first successful match determines what happens with the field. For example, when the pattern ['!key', '.*'] is used on the field name "key", the first '!key' matches it and blocks the field from passing through the filter.

In general, quoting the patterns with single quotes is better than with double quotes, because this way the special regexp characters don't need so much escaping with backslashes. Naturally, it's better to keep the field names alphanumeric too, to avoid getting funny effects when they are used in the patterns. Some particularly useful pattern examples:

  • '.*' - pass through everything
  • '.*/second_$&/' - pass everything and prepend 'second_' to it
  • 'right_(.*)/$1/' - pass the field names starting with 'right_' and remove this prefix from them
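Read as ordinary Perl, a substitution pattern behaves like an anchored s/// applied to the field name. The sketch below uses plain Perl regexps, without Triceps, to show what the last two patterns do; the sample field names are made up.

```perl
use strict;
use warnings;

# '.*/second_$&/' acts like an anchored substitution; $& is the whole match.
my $name = 'price';
(my $prefixed = $name) =~ s/^.*$/second_$&/;
print "$name -> $prefixed\n";

# 'right_(.*)/$1/' keeps only what the parentheses captured.
my $rname = 'right_symbol';
(my $stripped = $rname) =~ s/^right_(.*)$/$1/;
print "$rname -> $stripped\n";

# A name that doesn't start with 'right_' doesn't match the anchored
# regexp, so this pattern alone wouldn't pass it through.
my $matches = ('price' =~ /^right_(.*)$/) ? "yes" : "no";
print "does 'price' match? $matches\n";
```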

More examples of the patterns have been shown with the joins.

The result is an array of field names after translation. That array has the same size as @incomingFields, and keeps the passed-through fields in the same order. The fields that don't pass through get replaced with undef.
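To make the rules concrete, here is a small pure-Perl model of the filtering as described so far: patterns are tried in order, the first match decides, and a field that doesn't pass becomes undef. It is an illustration only, not the Triceps implementation; filterSketch is a made-up name, and the model leaves out the error checking.

```perl
use strict;
use warnings;

# Hypothetical pure-Perl model of the filtering rules described above.
# It is NOT the Triceps code and does no validation of the patterns.
sub filterSketch {
    my ($incoming, $patterns) = @_;
    my @res;
    foreach my $f (@$incoming) {
        my $out; # stays undef unless a passing pattern matches first
        foreach my $p (@$patterns) {
            if ($p =~ m{\A!(.*)\z}) {                    # '!regexp': block
                my $re = $1;
                last if $f =~ /^$re$/;
            } elsif ($p =~ m{\A([^/]*)/([^/]*)/?\z}) {   # 'regexp/regsub'
                my ($re, $sub) = ($1, $2);
                my $renamed = $f;
                # A string eval lets $& and $1 in the substitution expand.
                # That is tolerable here only because the patterns come
                # from the program itself, never from untrusted input.
                if (eval "\$renamed =~ s/^$re\$/$sub/") {
                    $out = $renamed;
                    last;
                }
            } elsif ($f =~ /^$p$/) {                     # 'regexp': pass as is
                $out = $f;
                last;
            }
        }
        push @res, $out;
    }
    return @res;
}

my @in  = ('key', 'right_a', 'right_b', 'other');
my @out = filterSketch(\@in, ['!key', 'right_(.*)/$1/']);
print join(", ", map { defined $_ ? $_ : 'undef' } @out), "\n";
```

The result has one element per incoming field, so the positions in it can be matched back to the positions in the original list.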

If a pattern specifies a literal alphanumeric field name without any regexp wildcard (such as 'key' or '!key' or 'key/right_$&/'), this function makes sure that the field is present in the original field list. If it isn't, the function confesses. This keeps accidental typos in the field names from going unnoticed. No such check is done on the general patterns, to allow the reuse of patterns on many different field lists, including those where the pattern doesn't match anything.
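A check of this kind could look roughly like the sketch below. It is a guess at the shape of the logic, not the Triceps code: the function name, the regexp that recognizes a "literal" pattern, and the wording of the message are all assumptions.

```perl
use strict;
use warnings;
use Carp;

# Hypothetical model of the typo check described above; the exact rule
# Triceps uses to recognize a literal field name may differ.
sub checkLiteralPatternsSketch {
    my ($callerDescr, $incoming, $patterns) = @_;
    my %known = map { $_ => 1 } @$incoming;
    foreach my $p (@$patterns) {
        # an optional '!', then word characters only, then an optional '/regsub'
        if ($p =~ m{\A!?(\w+)(?:/.*)?\z} && !exists $known{$1}) {
            confess "$callerDescr: field '$1' from pattern '$p' is not in the field list";
        }
    }
    return 1;
}

my @fields = ('key', 'value');

# A general pattern that matches nothing is accepted silently.
checkLiteralPatternsSketch("demo", \@fields, ['zzz.*']);

# A misspelled literal name is reported.
my $ok = eval { checkLiteralPatternsSketch("demo", \@fields, ['kye']); 1 };
print "caught: ", (split /\n/, $@)[0], "\n" unless $ok;
```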

The function doesn't check for any duplicates in the resulting field names, nor for any funny characters in them. The reason for not checking the duplicates is that often the result is combined from multiple sets of filtered fields, and the check for duplicates makes sense only after these sets are combined.
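The caller can do the duplicate check after combining the sets. A minimal sketch, with hard-coded stand-ins for two filtered lists (the field names are made up, and undef marks a field that didn't pass):

```perl
use strict;
use warnings;

# Stand-ins for two already filtered lists, in the form filter() returns.
my @leftRes  = ('id', 'price', undef);
my @rightRes = (undef, 'price', 'size');

# Combine the sets first, then look for duplicates in the combination.
# Each duplicated name is reported once, on its second occurrence.
my %seen;
my @dups = grep { $seen{$_}++ == 1 } grep { defined } (@leftRes, @rightRes);

print "duplicate result fields: @dups\n" if @dups;
```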

The function

@pairs = &Triceps::Fields::filterToPairs($callerDescr,
  \@incomingFields, \@patterns);

is a version of filter() that does the same but returns the result in a different form. This time the result contains a pair of values "oldName", "newName" for each field that passes through (of course, if the field is not renamed, "oldName" and "newName" will be the same). For example, if called

@pairs = &Triceps::Fields::filterToPairs("MyTemplate result",
  ["abc", "abcd", "fgh", "jkl"], ["!abc", "a.*", "jkl/qwe"]);

the result in @pairs will be: ("abcd", "abcd", "jkl", "qwe"). The field "abcd" made it through as is, the field "jkl" got renamed to "qwe". You can also put the result of filterToPairs directly into a hash:

%resultFields = &Triceps::Fields::filterToPairs(...);

Other than the result format, filterToPairs() works exactly the same as filter(). It's just that sometimes one format of the result is more convenient, sometimes the other.
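The two formats carry the same information, which can be seen by converting one into the other by hand. The sketch below doesn't call Triceps; it starts from the array that filter() would be expected to return for the example above, going by the description of its result format, and builds the pairs from it.

```perl
use strict;
use warnings;

# The incoming names, and the result in the form filter() returns for the
# patterns ["!abc", "a.*", "jkl/qwe"]: undef marks a field that didn't pass.
my @incoming = ("abc", "abcd", "fgh", "jkl");
my @filtered = (undef, "abcd", undef, "qwe");

# Build the ("oldName", "newName") list by walking both arrays in parallel.
my @pairs;
foreach my $i (0 .. $#incoming) {
    push @pairs, $incoming[$i], $filtered[$i] if defined $filtered[$i];
}
print join(", ", @pairs), "\n";

# As a hash, the pairs allow lookup by the old name, but a Perl hash
# doesn't preserve the order of the fields.
my %resultFields = @pairs;
print "jkl is renamed to $resultFields{jkl}\n";
```

The array form is handy when the position of each field matters, the pairs or hash form when only the fields that passed and their new names are of interest.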
