Friday, April 11, 2014

callBound

I've found that I've missed documenting yet another way to call a streaming function in Perl, the method Unit::callBound().

$unit->callBound($rowop_or_tray, $fnreturn => $fnbinding, ...);
$unit->callBound([@rowops], $fnreturn => $fnbinding, ...);

It's an encapsulation of a streaming function call, a great method if you have all the rowops for the call available upfront. The first argument is a rowop, a tray, or a reference to an array of rowops (but trays are not allowed inside the array). The rest are the pairs of FnReturns and FnBindings. The bindings are pushed onto the FnReturns, then the rowops are called, then the bindings are popped. It replaces a whole block that would contain an AutoFnBind and the calls:

{
  my $ab = Triceps::AutoFnBind->new(
    $fnreturn => $fnbinding, ...
  );
  $unit->call($rowop_or_tray);
}

The difference is that callBound() does its work in C++, so it's more efficient than the Perl block, and it's shorter to write too.
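
For a concrete feel, here is a minimal sketch. The names are made up for the illustration: $fretOrders is an FnReturn of some streaming function, $bindAvg is an FnBinding built for it, and $rop1, $rop2 are rowops aimed at the function's input labels.

# push $bindAvg onto $fretOrders, call both rowops, then pop the binding
$unit->callBound([$rop1, $rop2], $fretOrders => $bindAvg);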

Sunday, March 23, 2014

debugging

I've recently attended a training on debugging on Windows (a pretty good one, by Wintellect, if you're interested). And all along I've been impressed by how similar the features of WinDbg are to the features I wrote back at Aleri into the debugger for the Aleri CEP virtual machine. (Okay, gdb is probably similar too, but I've never learned it in depth; I've never been a big fan of debuggers.) I guess the fundamentals of debugging are the same, no matter what the platform is. Except that I had also written into it a way to go backwards after hitting a breakpoint and trace back what had led to the current situation.

Braced

I've found that the Braced package was undocumented. First of all, I've added the proper unit tests for it, and thus renamed it from Triceps::X::Braced to Triceps::Braced (the X namespace is for the packages with limited testing). I've also renamed the methods, replacing the word "quote" with "escape", to make their meaning clearer.  And here goes the documentation section that I wrote for the upcoming manual:


The package Braced is designed to parse the Tcl-like nested lists where the elements are separated by whitespace, and braces are used to quote the elements that contain spaces. These lists are used to write the pipelines that form the Tql queries. For example:

{read table tWindow} {project fields {symbol price}} {print tokenized 0}

These lists can then be parsed into elements, and the elements might also be lists that can be parsed into elements, and so on. The spaces between the braced elements are optional; the braces themselves also serve as separators. For example, the following lines are equivalent:

a b c
{a} {b} {c}
{a}{b}{c}
{a}b{c}

If a brace character needs to be included in one of the strings, it can be escaped with a backslash, for example:

{a\{} b\}c

Any other Perl backslash escapes, such as \n or \x20, work too. The quote characters have no special meaning: they don't need to be escaped, and they don't group words. For example, the following two are equivalent:

"a b c"
{"a} {b} {c"}

Escaping the spaces (\ ) provides another way to combine the words into one element. The following two are equivalent:

{a b c}
a\ b\ c

There is no need for nested escaping. The characters need to be escaped only once, and then the resulting strings can be wrapped into any number of brace levels.

All the methods in this module are static; there are no objects.

$string = $data;
@elements = Triceps::Braced::raw_split_braced($string);
confess "Unbalanced braces around '$string'" if $string;

Split the string into the braced elements. If any of the elements were enclosed in their own braces, these braces are left in place; the element string will still contain them. For example, a {b} {c d} will be split into a, {b}, {c d}. No unescaping is done, the escaped characters are passed through as-is. This method of splitting is rarely used; it's present mostly as a baseline.

The original string argument will be fully consumed. If anything is left unconsumed, this is an indication of a syntax error, with unbalanced braces. The argument may not be a constant because it gets modified.

$string = $data;
@elements = Triceps::Braced::split_braced($string);
confess "Unbalanced braces around '$string'" if $string;

Split the string into the braced elements. If any of the elements were enclosed in their own braces, these braces will be removed from the results. For example, a {b} {c d} will be split into a, b, c d. No unescaping is done, the escaped characters are passed through as-is. This is the normal method of splitting; it allows the elements to be split further recursively.

The original string argument will be fully consumed. If anything is left unconsumed, this is an indication of a syntax error, with unbalanced braces. The argument may not be a constant because it gets modified.

$result = Triceps::Braced::bunescape($string);

Un-escape a string by processing all the escape characters in it. This step is normally done last, after all the splitting is done. The result becomes unsuitable for further splitting because the escaped characters lose their special meaning. If any literal braces are present in the argument, they will pass through to the result as literals. For example, {a \{b } will become {a {b }.

@results = Triceps::Braced::bunescape_all(@strings);

Perform the un-escaping on a whole array of strings. The result array will contain the same number of elements as the argument.

$ref_results = Triceps::Braced::split_braced_final($string);
confess "Unbalanced braces around '$string'" if $string;

The combined functionality of splitting a string and un-escaping the result elements. That's why it's final: no further splits must be done after un-escaping. The return value is different from the other split methods: it is a reference to the array of result strings. The difference has been introduced to propagate the undef from the argument to the result: if the argument string is undef, the result will also be undef, not a reference to an empty array. The string gets consumed in the same way as for the other split methods, and anything left in it indicates an unbalanced brace.
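
To tie the methods together, here is a small sketch that parses the Tql-like pipeline from the earlier example by splitting it recursively and un-escaping at the end. The variable names are made up for this sketch.

my $query = '{read table tWindow} {project fields {symbol price}} {print tokenized 0}';
my @cmds = Triceps::Braced::split_braced($query);
confess "Unbalanced braces around '$query'" if $query;
# @cmds is now ('read table tWindow', 'project fields {symbol price}', 'print tokenized 0')
foreach my $cmd (@cmds) {
  my $c = $cmd; # split_braced() consumes its argument, so work on a copy
  my @words = Triceps::Braced::split_braced($c);
  confess "Unbalanced braces around '$c'" if $c;
  my @unescaped = Triceps::Braced::bunescape_all(@words);
  # for the second command @unescaped is ('project', 'fields', 'symbol price')
}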

Thursday, March 6, 2014

automatic views for the normalized databases

I'm still working on the editing of the docs for 2.0. In the meantime, I want to bullshit on a general subject.

When designing a database schema, there is always this fine line between normalization and denormalization. Denormalize too much and you get the same data in lots of places, with the possibility of it getting out of sync. Normalize too much and every query becomes a join of a dozen tables. I've had to deal with a highly normalized database, and I must tell you, it's been quite a pain.

Well, views to the rescue, right? But the views need to be defined, and then their capacity for inserts and updates is quite limited. Let's see if it can be done better.

When the foreign keys are explicitly defined, they can be used to implicitly define a view. For example, suppose that we have a set of tables:

TABLE Employee (
  EmpId INTEGER PRIMARY KEY,
  EmpFirstName VARCHAR,
  EmpLastName VARCHAR,
  EmpStartDate DATE,
  EmpSalary FLOAT
);

TABLE Department (
  DeptId INTEGER PRIMARY KEY,
  DeptName VARCHAR UNIQUE
);

TABLE Title (
  TitleId INTEGER PRIMARY KEY,
  TitleName VARCHAR UNIQUE,
  TitleLevel INTEGER
);

TABLE Position (
  EmpId INTEGER FOREIGN KEY (Employee.EmpId),
  DeptId INTEGER FOREIGN KEY (Department.DeptId),
  TitleId INTEGER FOREIGN KEY (Title.TitleId)
);

The knowledge of the foreign keys allows the DBMS to implicitly create a view:

CREATE VIEW Position_EXP
AS SELECT *
FROM Position, Employee, Department, Title
WHERE Position.EmpId = Employee.EmpId
  AND Position.DeptId = Department.DeptId
  AND Position.TitleId = Title.TitleId;

Okay, I've made things a bit easier by making sure that the field names don't overlap, but in general nothing stops the DBMS from doing this automatically, adding prefixes to the field names if needed. Queries like

SELECT EmpStartDate
FROM Position_EXP
WHERE DeptName='Marketing' AND TitleLevel > 5;

become much more compact with this view.

Better yet, it's pretty straightforward to use this view for updates. For example,

UPDATE Position_EXP
SET TitleName = 'Engineer 2'
WHERE TitleName = 'Engineer 1' AND EmpStartDate > 2010-10-10;

What it translates to is: first find the TitleId for TitleName 'Engineer 2', then set it into the Position rows that match the conditions. It's not something a normal RDBMS allows you to do with a view, and normally this would require two separate SQL statements, but the translation is fairly straightforward. And there is no reason why an RDBMS can't do this translation automatically.

We can do inserts too. For example:

INSERT INTO Employee_EXP, Position_EXP
VALUES (
  EmpId = EmpIdSequence(),
  EmpFirstName = 'John',
  EmpLastName = 'Doe',
  EmpStartDate = TODAY(),
  EmpSalary = 123456,
  DeptName = 'Engineering',
  TitleName = 'Senior Engineer'
);

Some values would go directly into the Employee table, some into the Position table, and some would be looked up from the helper tables to create the binding by ids. I've listed two table names in INSERT to give an explicit hint as to which tables under the views are getting the inserts, as opposed to just being used for look-ups, since I think it would be too dangerous to insert things that weren't meant to be inserted.

The statement is small and short, just as for the denormalized tables (maybe even shorter) but handles all the underlying normalization. And again, it's pretty straightforward to deduce the meaning of the statement automatically in the SQL parser.

Saturday, January 18, 2014

performance variations

I've had an opportunity to run the performance tests on a few more laptops. All of the same Core2 generation, but with different CPU frequencies. The 2GHz version expectedly showed about 10% lower performance, except for the inter-thread communication through the nexus: that went up. On a 3GHz CPU all the performance went up about proportionally. So I'm not sure what was up with the 2.2GHz CPU; maybe the timing just worked out wrong, adding more overhead.

Here are the results from a 3GHz CPU:

Performance test, 100000 iterations, real time.
Empty Perl loop 0.006801 s, 14702927.05 per second.
Empty Perl function of 5 args 0.031918 s, 3133046.99 per second.
Empty Perl function of 10 args 0.027418 s, 3647284.30 per second.
Row creation from array and destruction 0.277232 s, 360708.19 per second.
Row creation from hash and destruction 0.498189 s, 200727.04 per second.
Rowop creation and destruction 0.155996 s, 641043.69 per second.
Calling a dummy label 0.098480 s, 1015437.21 per second.
Calling a chained dummy label 0.110546 s, 904601.83 per second.
  Pure chained call 0.012066 s, 8287664.25 per second.
Calling a Perl label 0.512559 s, 195099.61 per second.
Row handle creation and destruction 0.195778 s, 510781.66 per second.
Repeated table insert (single hashed idx, direct) 0.772934 s, 129377.12 per second.
Repeated table insert (single hashed idx, direct & Perl construct) 1.109781 s, 90107.89 per second.
  RowHandle creation overhead in Perl 0.336847 s, 296871.05 per second.
Repeated table insert (single sorted idx, direct) 2.122350 s, 47117.59 per second.
Repeated table insert (single hashed idx, call) 0.867846 s, 115227.88 per second.
Table insert makeRowArray (single hashed idx, direct) 1.224588 s, 81660.14 per second.
  Excluding makeRowArray 0.947355 s, 105557.02 per second.
Table insert makeRowArray (double hashed idx, direct) 1.443053 s, 69297.51 per second.
  Excluding makeRowArray 1.165821 s, 85776.47 per second.
  Overhead of second index 0.218466 s, 457738.04 per second.
Table insert makeRowArray (single sorted idx, direct) 29.880962 s, 3346.61 per second.
  Excluding makeRowArray 29.603730 s, 3377.95 per second.
Table lookup (single hashed idx) 0.287407 s, 347938.44 per second.
Table lookup (single sorted idx) 9.160540 s, 10916.39 per second.
Lookup join (single hashed idx) 3.940388 s, 25378.21 per second.
Nexus pass (1 row/flush) 0.618648 s, 161642.86 per second.
Nexus pass (10 rows/flush) 2.417818 s, 413596.01 per second.


I've added a few more tests: the table look-ups, and the passing of rows through the nexus with 10 rows per batch. With the batching, the inter-thread communication works decently fast.

Sunday, January 5, 2014

added a little missing method

When editing the docs, I've noticed an incompleteness in AggregatorGadget, so I've added the method that was missing:

const IndexType *getIndexType() const;

Not that there was any use for it (and the subclasses could just read the field directly), but still.

Friday, December 20, 2013

PowerShell

First, a status update: I've finally resumed the work on the docs for 2.0, though it's proceeding slowly so far.

I've been reading recently on PowerShell. A pretty cool tool. I've tried it before and it didn't work well for me then because I didn't understand its purpose. It's not a normal OS shell. Instead, it's the shell for the .NET virtual machine. Exactly the thing that Java is missing, and the gap that it tries to plug with the crap like Ant and Maven, unsuccessfully. PowerShell lets you run all the .NET methods interactively from the command line, and build pipelines out of them. It has some very cool syntax that lets you automatically apply the pipeline input in the same way as the command-line input. It also has the remote execution functionality, so it serves as an analog of rsh/ssh (more advanced in some ways, less advanced in others) in the Microsoft ecosystem.

But here is the CEP-related part: the Triceps TQL is not unique in handling the SQL-like queries in the form of pipelines. PowerShell does that too. I guess it's a fairly obvious idea. You can even treat PowerShell as a rudimentary CEP system, and write the processing in the form of CEP pipelines. There is a major limitation that the pipelines are all linear, with no forking and joining, but on the other hand the pipelines can be used to pass arbitrarily complex objects, and also a mix of objects of different types, so with some creativity the more complex topologies can be simulated (still no loops though).

Another catch with the pipelines is very similar to a current limitation of TQL: each stage of the pipeline works all by itself. In a SQL statement the query optimizer can turn a WHERE clause into an iteration over a small subset of an index. In TQL and PowerShell there will be an iteration over everything, followed by filtering in a WHERE. The grand plan for TQL is to eventually add a query optimizer, one that could combine multiple sequential stages of the pipeline into one optimized stage. The other, simpler, alternative that I was considering for the short term is to specify the optimized selection manually as an option to the command that reads the table contents. PowerShell takes that approach in many cases: say, the command that pulls the data from a SQL Server database gets a full SQL query as an argument and does its filtering on the server. But I guess theoretically nothing really stops them from doing some pipeline optimization by pulling the WHERE conditions from a "where" command into the SQL statement for the database selection command. It would save the trouble of the SQL and "where" having different syntaxes.

Sunday, November 10, 2013

Triceps performance

I've finally got interested enough in Triceps performance to write a little test, Perf.t. By default it runs only one thousand iterations, to be fast and not delay the run of the full test suite. But the number can be increased by setting an environment variable, like:

$ TRICEPS_PERF_COUNT=100000 perl t/Perf.t

An important caveat: the test is of the Perl interface, so it includes all the overhead of constructing the Perl objects. I've tried to structure it so that some of the underlying performance can be deduced, but it's still approximate. I haven't done the performance testing of just the underlying C++ implementation yet; it will be better.

Here are the numbers I've got on my 6-year-old laptop (dual-CPU Intel Core2 T7600 2.33GHz), with explanations. The time in seconds for each value is for the whole test loop. The "per second" number shows how many loop iterations were done per second.

The computations are done with the real elapsed time, so if the machine is not idle, the time of the other processes will still get counted against the tests, and the results will show as slower than they really are.

Performance test, 1000 iterations, real time.

The first thing it prints is the iteration count, to set the expectations for the run length and precision.

Empty Perl loop 0.000083 s, 11983725.71 per second. 

A calibration to see how much overhead is added by the execution of the loop itself. As it turns out, not much.

Row creation from array and destruction 0.003744 s, 267085.07 per second. 

The makeRowArray() for a row of 5 fields. Each created row gets destroyed before the next one gets created.

Row creation from hash and destruction 0.006420 s, 155771.52 per second.

The makeRowHash() for a row of 5 fields.


Rowop creation and destruction 0.002067 s, 483716.30 per second.

The makeRowop() from an existing row. Same thing, each rowop gets destroyed before constructing the next one.

Calling a dummy label 0.001358 s, 736488.85 per second.

Repeated calls of a dummy label with the same rowop object.

Calling a chained dummy label 0.001525 s, 655872.40 per second.
  Pure chained call 0.000167 s, 5991862.86 per second.



Repeated calls of a dummy label that has another dummy label chained to it. The "pure" part is the difference from the previous case, i.e. the overhead added by the extra chained dummy label.


Calling a Perl label 0.006669 s, 149946.52 per second.

Repeated calls of a Perl label with the same rowop object. The Perl label has an empty sub but that empty sub still gets executed, along with all the support functionality.

Row handle creation and destruction 0.002603 s, 384234.52 per second.

The creation of a table's row handle from a single row, including the creation of the Perl wrapper for the row handle object.

Repeated table insert (single hashed idx, direct) 0.010403 s, 96126.88 per second.

Insert of the same row into a table. Since the row is the same, it keeps replacing the previous one, and the table size stays at 1 row. Even though the row is the same, a new row handle gets constructed for it every time by the table; the code is $tSingleHashed->insert($row1). "Single hashed idx" means that the table has a single Hashed index, on an int32 field. "Direct" means the direct insert() call, as opposed to using the table's input label.

Repeated table insert (single hashed idx, direct & Perl construct) 0.014809 s, 67524.82 per second.
  RowHandle creation overhead in Perl 0.004406 s, 226939.94 per second.


The same, only the row handles are constructed in Perl before inserting them: $tSingleHashed->insert($tSingleHashed->makeRowHandle($row1)). And the second line shows that the overhead of wrapping the row handles for Perl is pretty noticeable (it's the difference from the previous test case).

Repeated table insert (single sorted idx, direct) 0.028623 s, 34937.39 per second.

The same thing, only for a table that uses a Sorted index that executes a Perl comparison on the same int32 field. As you can see, it gets 3 times slower.

Repeated table insert (single hashed idx, call) 0.011656 s, 85795.90 per second.

The same thing, again the table with a single Hashed index, but this time by sending the rowops to its input label.

Table insert makeRowArray (single hashed idx, direct) 0.015910 s, 62852.02 per second.
  Excluding makeRowArray 0.012166 s, 82194.52 per second.


Now different rows get inserted into the table, each row having a different key. At the end of this test the table contains 1000 rows (or however many were requested by the environment variable). Naturally, this is slower than the repeated insertions of the same row, since the tree of the table's index becomes deeper and requires more comparisons and rebalancing. This performance will be lower in the tests with more rows, since the index will become deeper and create more overhead. Since the rows are all different, they are created on the fly, so this row creation overhead needs to be excluded to get the actual Table performance.

Table insert makeRowArray (double hashed idx, direct) 0.017231 s, 58033.37 per second.
  Excluding makeRowArray 0.013487 s, 74143.61 per second.
  Overhead of second index 0.001321 s, 756957.95 per second.


Similar to the previous one but on a table that has two Hashed indexes (both on the same int32 field). The detail lines here also compute the overhead contributed by the second index.

Table insert makeRowArray (single sorted idx, direct) 0.226725 s, 4410.64 per second.
  Excluding makeRowArray 0.222980 s, 4484.70 per second.


Similar but for a table with a Sorted index with a Perl expression. As you can see, it's about 20 times slower (and it gets even worse for the larger row sets).

Nexus pass 0.034009 s, 29403.79 per second.

The performance of passing the rows between threads through a Nexus. This is a highly pessimistic case, with only one row per nexus transaction. The time also includes the draining and stopping of the app.

And here are the numbers for a run with 100 thousand iterations, for comparison:

Performance test, 100000 iterations, real time.
Empty Perl loop 0.008354 s, 11970045.66 per second.
Row creation from array and destruction 0.386317 s, 258854.76 per second.
Row creation from hash and destruction 0.640852 s, 156042.16 per second.
Rowop creation and destruction 0.198766 s, 503105.38 per second.
Calling a dummy label 0.130124 s, 768497.20 per second.
Calling a chained dummy label 0.147262 s, 679062.46 per second.
  Pure chained call 0.017138 s, 5835066.29 per second.
Calling a Perl label 0.652551 s, 153244.80 per second.
Row handle creation and destruction 0.252007 s, 396813.99 per second.
Repeated table insert (single hashed idx, direct) 1.053321 s, 94937.81 per second.
Repeated table insert (single hashed idx, direct & Perl construct) 1.465050 s, 68257.07 per second.
  RowHandle creation overhead in Perl 0.411729 s, 242878.43 per second.
Repeated table insert (single sorted idx, direct) 2.797103 s, 35751.28 per second.
Repeated table insert (single hashed idx, call) 1.161150 s, 86121.54 per second.
Table insert makeRowArray (single hashed idx, direct) 1.747032 s, 57239.94 per second.
  Excluding makeRowArray 1.360715 s, 73490.78 per second.
Table insert makeRowArray (double hashed idx, direct) 2.046829 s, 48856.07 per second.
  Excluding makeRowArray 1.660511 s, 60222.41 per second.
  Overhead of second index 0.299797 s, 333559.51 per second.
Table insert makeRowArray (single sorted idx, direct) 38.355396 s, 2607.20 per second.
  Excluding makeRowArray 37.969079 s, 2633.72 per second.
Nexus pass 1.076210 s, 92918.63 per second.

As you can see, the table insert performance got worse due to the added depth of the index trees, while the nexus performance got better because the drain overhead got spread over a larger number of rows.

Wednesday, November 6, 2013

redundancy

Here is another entry on the general things.

Fairly often we want things to be redundant: run two or more copies of a system, and if one of them fails, another one picks up after it. Two is the minimal number of instances, and it's a pretty unstable one: besides the chance of one instance failing, there is also the chance of a network partitioning. In the case of a network partitioning you don't want the two copies to start messing with the data independently; instead you really want to still keep only one copy as the master and shut down the other one. But a failure and a partitioning are pretty hard to tell apart, which makes the 2-instance configurations quite unstable.

Even if you have a 3-instance configuration (a good idea overall), you're still not immune. If one instance goes down for the scheduled maintenance, you're back to the 2-instance situation.

How can the situation be made more stable?

One obvious but expensive option is to just create more instances. Not only is it expensive, but it also adds its own problems. Suppose there are 4 instances to start with, and one instance finds that two other instances went down. Does it mean that these two instances just died (possibly for some common reason) or that a network partitioning has occurred?

An improvement on that would be to create the extra instances not as full ones but just as "beacons", only responding for the purpose of watching the partitioning and not actually running the code. Then the load on these beacon instances will be low, and they can be combined with machines running some other systems. Or a dedicated machine can serve as a beacon for many systems in the company. Well, once you ask "what if a beacon goes down", this starts growing a bit into a system of redundant beacons with their own problems of partitioning. And it could actually get stupid, with both instances seeing the beacon but not seeing each other, but that can be resolved by communicating through the beacon.

But a typical situation is a pipeline of systems. Each system is connected to one or more source systems and/or one or more sink systems (so it could be not only a straight pipeline but technically a tree). And each system in the pipeline would be duplicated or triplicated, and each instance cross-connected to all the duplicates of each source and sink. In this situation the master node of a source can be used as such a beacon! This would make a 2-instance configuration still stable.

Thursday, October 10, 2013

unusual aggregations

In the meantime, while I'm busy with the other things, I want to write up some short notes for the future.

In the world of monitoring there is a concept of time series: the sequence of values observed at certain moments of time. And then the common thing is to subtract the previous value from each one. For example, if the total number of events processed by a server is, by timestamps:

0 20 30 50

then by subtracting the previous value we get the number of events processed during each time period:

20 10 30

(naturally, there is one less value, since the first one in the original series has nothing to subtract from).

This can really be thought of as an aggregation. A highly unusual one, producing not just one result row per group but many rows per group, but still a variety of aggregation.
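
In plain Perl, outside of any aggregation framework, the computation is just a loop over the series (shown here only to make the arithmetic concrete):

my @totals = (0, 20, 30, 50); # the running totals by timestamp
my @deltas;
for (my $i = 1; $i <= $#totals; $i++) {
  push @deltas, $totals[$i] - $totals[$i - 1];
}
# @deltas is now (20, 10, 30), the events processed in each period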

The other similar example is with the timestamps. Suppose we have a multi-stage process, and a timestamped record of when each stage started:

11:30 Stage1
11:45 Stage2
12:10 Stage3
12:20 End

This record can be converted into the length of each stage by subtracting each stage's starting timestamp from the next timestamp:

Stage1 0:15
Stage2 0:25
Stage3 0:10

And the other way around, if we have the starting time and the record of the length of each stage, we can convert them back to the absolute timestamps.

Monday, September 9, 2013

status update

I've started on editing the manual for version 2.0 and then got distracted by the Real Life. Things should quiet down a bit in a couple more months and I'll get back to making the official release 2.0.

Thursday, August 1, 2013

NameSet update

While editing the docs, I've realized that the NameSet class being a single-threaded Starget, while it's used inside the multithreaded Mtarget IndexType, is not a good thing. So I've upgraded it to an Mtarget as well.

The other related item is the method IndexType::getKey(). There is no good reason to return an Autoref<NameSet> from it; a plain pointer is just as good and even better. The new prototype for it is:

virtual const NameSet *getKey() const;

Saturday, July 27, 2013

string utilities

As I'm editing the documentation for 2.0, I've found an omission: the string helper functions in the C++ API haven't been documented yet. Some of them have been mentioned but not officially documented.

The first two are declared in common/Strprintf.h:

string strprintf(const char *fmt, ...);
string vstrprintf(const char *fmt, va_list ap);

They are entirely similar to sprintf() and vsprintf() with the difference that they place the result of formatting into a newly constructed string and return that string.

The rest are defined in common/StringUtil.h:

extern const string &NOINDENT;

The special constant that, when passed to the printing of a Type, causes it to print without line breaks. It doesn't have any special effect on Errors; there it's simply treated as an empty string.

const string &nextindent(const string &indent, const string &subindent, string &target);

Compute the indentation for the next level when printing a Type. The arguments are:

indent - indent string of the current level
subindent - characters to append for the next indent level
target - buffer to store the extended indent string

The passing of target as an argument allows reusing the same string object and avoiding the extra construction.


The function returns the computed reference: if indent was NOINDENT, then a reference to NOINDENT, otherwise a reference to target. This particular calling pattern is strongly tied to how things are computed inside the type printing, but you're welcome to look inside it and do the same for any other purpose.

void newlineTo(string &res, const string &indent);

Another helper function for the printing of Type, inserting a line break. The indent argument specifies the indentation, with the special handling of NOINDENT: if indent is NOINDENT, a single space is added, thus printing everything in one line; otherwise a \n and the contents of indent are added. The res argument is the result string, where the line break characters are added.

void hexdump(string &dest, const void *bytes, size_t n, const char *indent = "");

Print a hex dump of a sequence of bytes (at address bytes and of length n), appending the dump to the destination string dest. The data will be nicely broken into lines, with 16 bytes printed per line. The first line is added directly to the end of dest as-is, but if n is over 16, each following line will start after a \n. The indent argument allows adding indentation at the start of each following line.

void hexdump(FILE *dest, const void *bytes, size_t n, const char *indent = "");

Another version, sending the dumped data directly into a file stream.

The next pair of functions provides a generic mechanism for converting enums between a string and integer representation:

struct Valname
{
    int val_;
    const char *name_;
};

int string2enum(const Valname *reft, const char *name);
const char *enum2string(const Valname *reft, int val, const char *def = "???");

The reference table is defined as an array of Valnames, with the last element being { -1, NULL }. Then it's passed as the argument reft of the conversion functions, which do a sequential look-up in that table. If the argument is not found, string2enum() will return -1, and enum2string() will return the value of the def argument (which may be NULL).

Here is an example of how it's used for the conversion of opcode flags:

Valname opcodeFlags[] = {
    { Rowop::OCF_INSERT, "OCF_INSERT" },
    { Rowop::OCF_DELETE, "OCF_DELETE" },
    { -1, NULL }
};

const char *Rowop::ocfString(int flag, const char *def)
{
    return enum2string(opcodeFlags, flag, def);
}

int Rowop::stringOcf(const char *flag)
{
    return string2enum(opcodeFlags, flag);
}

Friday, July 19, 2013

GCC warning flags

The GCC warning flags have been an annoying issue: in general, I've been building Triceps with warnings-treated-as-errors, except for a few of them disabled for uselessness. However this presents a problem with the varying versions of GCC: the older versions don't have some of the warning flags I use, so they fail, and the new ones have more warnings of the useless variety that fail in a different way (and there are some non-useless ones too, which occasionally help detect more weird stuff, but that's not the point).

The solution I've come up with is to use different warning flags for the build checked out from SVN on the trunk branch than for all the other builds. All these warning flags are now enabled only if the path of the Triceps directory ends in "trunk". Otherwise they are skipped. If you check out the code directly from SVN, you still need to worry about the warning flags, but not otherwise.

Thursday, July 18, 2013

Perl 5.19 and SIGUSR2

I've tested Triceps with Perl version 5.19. This required fixing some expected error messages that have changed, and now the patterns accept both the old and new error messages.

But the worst part is that Perl 5.19 was crashing on SIGUSR2. If you're interested in the details, see https://rt.perl.org/rt3//Public/Bug/Display.html?id=118929. I've worked around this issue by overriding Perl's signal handler for SIGUSR2 in the XS code.

The method is Triceps::sigusr2_setup(), and it gets called during the Triceps module loading. Internally it translates to the C++ method Sigusr2::setup() that sets the dummy handler on the first call.

This has a consequence that you can't set a real SIGUSR2 handler in Perl any more. But it stops Perl from crashing, and there probably isn't much reason to do a custom handler of SIGUSR2 anyway.

Sunday, July 14, 2013

BasicPthread reference (C++)

As you can see from the previous descriptions, building a new Triead is a serious business, containing many moving parts. Doing it every time from scratch would be hugely annoying and error prone. The class BasicPthread, defined in app/BasicPthread.h, takes care of wrapping all that complicated logic.

It originated as a subclass of pw::pwthread, and even though it ended up easier to copy and modify the code (okay, maybe this means that pwthread can be made more flexible), the usage is still very similar to it. You define a new subclass of BasicPthread, and define the virtual function execute() in it. Then you instantiate the object and call the method start() with the App argument.

For a very simple example:

class MainLoopPthread : public BasicPthread
{
public:
    MainLoopPthread(const string &name):
        BasicPthread(name)
    {
    }

    // overrides BasicPthread::execute
    virtual void execute(TrieadOwner *to)
    {
        to->readyReady();
        to->mainLoop();
    }
};

...

Autoref<MainLoopPthread> pt3 = new MainLoopPthread("t3");
pt3->start(myapp);

It will properly create the Triead and TrieadOwner, register the thread joiner, and start the execution. The TrieadOwner will be passed through to the execute() method, and its field fi_ will contain the reference to the FileInterrupt object. After execute() returns, it will take care of marking the thread as dead.

It also wraps the call of execute() into a try/catch block, so any Exceptions thrown will be caught and cause the App to abort. In short, it's very similar to the Triead management in Perl.

You don't need to keep the reference to the thread object afterwards; you can even do the construction and the start in one go:

(new MainLoopPthread("t3"))->start(myapp);

The internals of BasicPthread will make sure that the object will be dereferenced (and thus, in the absence of other references, destroyed) after the thread gets joined by the harvester.

Of course, if you need to pass more arguments to the thread, you can define them as fields in your subclass, set them in the constructor (or by other means between constructing the object and calling start()), and then execute() can access them. Remember, execute() is a method, so it receives not only the TrieadOwner as an argument but also the BasicPthread object as this.

BasicPthread is implemented as a subclass of TrieadJoin, and thus is an Mtarget. It provides the concrete implementation of the joiner's virtual methods, join() and interrupt(). Interrupt() calls the method of the base class, then sends the signal SIGUSR2 to the target thread.

And finally the actual reference:

BasicPthread(const string &name);

Constructor. The name of the thread is passed through to App::makeTriead(). The Triead will be constructed in start(); the BasicPthread constructor just collects the arguments together.

void start(Autoref<App> app);

Construct the Triead, create the POSIX thread, and start the execution there.

void start(Autoref<TrieadOwner> to);

Similar to the other version of start() but uses a pre-constructed TrieadOwner object. This version is useful mostly for the tests, and should not be used much in the real life.

virtual void execute(TrieadOwner *to);

Method that must be redefined by the subclass, containing the thread's logic.

virtual void join();
virtual void interrupt();

Methods inherited from TrieadJoin, providing the proper implementations for the POSIX threads.

And unless I've missed something, this concludes the description of the Triceps threads API.

Saturday, July 13, 2013

FileInterrupt reference (C++)

FileInterrupt is the class that keeps track of a bunch of file descriptors and revokes them on demand, hopefully interrupting any ongoing operations on them (and if that doesn't do the job, a separately sent signal will). It's not visible in Perl, being integrated into the TrieadOwner methods, but in C++ it's a separate class. It's defined in app/FileInterrupt.h, and is an Mtarget, since the descriptors are registered and revoked from different threads.

FileInterrupt();

The constructor, absolutely plain. Normally you would not want to construct it directly but use the object already constructed in TrieadJoin. The object keeps the state of whether the interruption has happened, and is obviously initialized to the non-interrupted state.

void trackFd(int fd);

Add a file descriptor to the tracked interruptable set. If the interruption was already done, the descriptor will instead be revoked right away by dupping over from /dev/null. If the attempt to open /dev/null fails, it will throw an Exception.

void forgetFd(int fd);

Remove a file descriptor from the tracked interruptable set. You must do it before closing the descriptor, or a race leading to the corruption of random file descriptors may occur. If this file descriptor was not registered, the call will be silently ignored.

void interrupt();

Perform the revocation of all the registered file descriptors by dupping over them from /dev/null. If the attempt to open /dev/null fails, it will throw an Exception.

This marks the FileInterrupt object state as interrupted, and any following trackFd() calls will lead to the immediate revocation of the file descriptors in them, thus preventing any race conditions.

bool isInterrupted() const;

Check whether this object has been interrupted.

TrieadJoin reference (C++)

TrieadJoin is the abstract base class that tells the harvester how to join a thread after it has finished. Obviously, it's present only in the C++ API and not in Perl (and by the way, the reference of the Perl classes in 2.0 has been completed; the remaining classes are only in C++).

Currently TrieadJoin has two subclasses: PerlTrieadJoin for the Perl threads and BasicPthread for the POSIX threads in C++. I won't be describing PerlTrieadJoin, since it's in the internals of the Perl implementation, never intended to be directly used by the application developers, and if you're interested, you can always look at its source code. I'll describe BasicPthread later.

Well, actually there is not a whole lot of direct use for TrieadJoin either: you need to worry about it only if you want to define a joiner for some other kind of threads, and this is not very likely. But since I've started, I'll complete the write-up of it.

So, if you want to define a joiner for some other kind of threads, you define a subclass of it, with an appropriately defined method join().

TrieadJoin is an Mtarget, naturally referenced from multiple threads (at the very least it's created in the thread to be joined or its parent, and then passed to the harvester thread by calling App::defineJoin()). The methods of TrieadJoin are:

TrieadJoin(const string &name);

The constructor. The name is the name of the Triead, used for the error messages. For various synchronization reasons, this makes the life of the harvester much easier than trying to look up the name from the Triead object.

virtual void join() = 0;

The Most Important joining method to be defined by the subclass. The subclass object must also hold the identity of the thread in it, to know which thread to join. The harvester will call this method.

virtual void interrupt();

The method that interrupts the target thread when it's requested to die. It's called in the context of the thread that triggers the App shutdown (or otherwise requests the target thread to die). By default the TrieadJoin carries a FileInterrupt object in it (it gets created on TrieadJoin construction, and then TrieadJoin keeps a reference to it), which will get called by this method to revoke the files. But everything else is a part of the threading system, and the base class doesn't know how to do it; the subclasses must define their own methods, wrapping the base class.

Both PerlTrieadJoin and BasicPthread add sending the signal SIGUSR2 to the target thread. For that they use the same target thread identity kept in the object as used by the join() call.

FileInterrupt *fileInterrupt() const;

Get a pointer to the FileInterrupt object defined in the TrieadJoin. The most typical use is to pass it to the TrieadOwner object, so that it can be easily found later:

to->fileInterrupt_ = fileInterrupt();

Though of course it could be kept in a separate local Autoref instead.

const string &getName() const;

Get back the name of the joiner's thread.

SIGUSR2

When a thread is requested to die, its registered file descriptors become revoked, and the signal SIGUSR2 is sent to it to interrupt any ongoing system calls. For this to work correctly, there must be a signal handler defined on SIGUSR2, because otherwise the default reaction to it is to kill the process. It doesn't matter what the handler is; just some handler must be there. The Triceps library defines an empty signal handler, but you can also define your own instead.

In Perl, the empty handler for SIGUSR2 is set when the module Triceps.pm is loaded. You can change it afterwards.
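
For example, a custom handler can be set through the usual Perl %SIG mechanism (the handler body here is just a placeholder):

# replace the empty handler installed by Triceps.pm with a custom one
$SIG{USR2} = sub {
  # any custom logic; the handler only needs to exist, it may do nothing
};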

In C++ Triceps provides a class Sigusr2, defined in app/Sigusr2.h, to help with this. If you use the class BasicPthread, you don't need to deal with Sigusr2 directly: BasicPthread takes care of it. All the methods of Sigusr2 are static.

static void setup();

Set up an empty handler for SIGUSR2 if it hasn't been done yet. This class has a static flag (synchronized by a mutex) showing that the handler had been set up. On the first call it sets the handler and sets the flag. On the subsequent calls it checks the flag and does nothing.

static void markDone();

Just set the flag that the setup has been done. This allows you to set your own handler instead and still cooperate with the logic of Sigusr2 and BasicPthread.

If you want to set your custom handler before any threads have been started, set it up and then call markDone(), telling Sigusr2 that there is no need to set the handler any more.

If you set your custom handler when the Triceps threads are already running (not the best idea but still a possibility), there is a possibility of a race with another thread calling setup(). To work around that race, set up your handler, call markDone(), then set up your handler again.

static void reSetup();

This allows replacing the custom handler back with the empty one. It always forcibly sets the empty handler (and also the flag).

odds and ends

While working on the threads support, I've added a few small features here and there. Some of them have already been described; some will be described now. I've also done a few more small clean-ups.

First, the historic methods setName() are now gone everywhere. This means Unit and Label classes, and in C++ also the Gadget. The names can now only be specified during the object construction.

FnReturn has the new method:

$res = $fret->isFaceted();

bool isFaceted() const;

It returns true (or 1 in Perl) if this FnReturn object is a part of a Facet.

Unit has gained a couple of methods:

$res = $unit->isFrameEmpty();

bool isFrameEmpty() const;

Check whether the current frame is empty. This is different from the method empty() that checks whether the whole unit is empty. This method is useful if you run multiple units in the same thread, with some potentially complicated cross-unit scheduling. It's what nextXtray() does with a multi-unit Triead, repeatedly calling drainFrame() for all the units that are found not empty. In this situation the simple empty() can not be used because the current inner frame might not be the outer frame, and draining the inner frame could be repeated forever while the outer frame still contains rowops. The more precise check of isFrameEmpty() prevents the possibility of such endless loops.
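
As an illustration, here is a sketch of such a multi-unit draining loop (the array @units of the units run by this thread is made up for the sketch), similar in spirit to what nextXtray() does:

my $more = 1;
while ($more) {
  $more = 0;
  foreach my $u (@units) {
    if (!$u->isFrameEmpty()) {
      $u->drainFrame();
      $more = 1; # the draining might have scheduled rowops in the other units
    }
  }
}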

$res = $unit->isInOuterFrame();

bool isInOuterFrame() const;

Check whether the unit's current inner frame is the same as its outer frame, which means that the unit is not in the middle of a call.


In Perl the method Rowop::printP() has gained an optional argument for the printed label name:

$text = $rop->printP();
$text = $rop->printP($lbname);

The reason for that is to make the printing of rowops in the chained labels more convenient. A chained label's execution handler receives the original unchanged rowop that refers to the first label in the chain. So when it gets printed, it will print the name of the first label in the chain, which might be very surprising. The explicit argument allows overriding it with the name of the chained label (or any other value).
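
For example, here is a sketch of a chained label whose handler prints the rowops under its own name rather than under the name of the original label (the label $lbOrig and the row type $rt are made up for this sketch):

my $lbPrint = $unit->makeLabel($rt, "lbPrint", undef, sub {
  my ($label, $rowop) = @_;
  # print with the chained label's own name instead of the original label's
  print($rowop->printP($label->getName()), "\n");
});
$lbOrig->chain($lbPrint);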


In C++ the Autoref has gained the method swap():

void swap(Autoref &other);

It swaps the values of two references without changing the reference counts in the referred values. This is a minor optimization for such a special situation. One or both references may contain NULL.


In C++ the Table has gained the support for sticky errors. The table internals contain a few places where the errors can't just throw an Exception because it would mess up the logic big time, most specifically the comparator functions for the indexes. The Triceps built-in indexes can't encounter any errors in the comparators but the user-defined ones, such as the Perl Sorted Index, can. Previously there was no way to report these errors other than to print the error message and then either continue pretending that nothing happened or abort the program.

The sticky errors provide a way out of this sticky situation. When an index comparator encounters an error, it reports it as a sticky error in the table and then returns false. The table logic then unrolls like nothing happened for a while, but before returning from the user-initiated method it will find this sticky error and throw an Exception at a safe time. Obviously, the incorrect comparison means that the table enters some messed-up state, so all the further operations on the table will keep finding this sticky error and throw an Exception right away, before doing anything. The sticky error can't be unstuck. The only way out of it is to just discard the table and move on.

void setStickyError(Erref err);

Set the sticky error from a location where an exception can not be thrown, such as from the comparators in the indexes. Only the first error sticks; all the others are ignored, since (a) the table will be dead and throwing this error in exceptions from this point on anyway, and (b) the comparator is likely to report the same error repeatedly and there is no point in seeing multiple copies.

Errors *getStickyError() const;

Get the table's sticky error. Normally there is no point in doing this manually, but just in case.

void checkStickyError() const;

If the sticky error has been set, throw an Exception with it.