Monday, December 27, 2021

a book on Perl

I've been browsing in a used book store and bought the book "Higher-Order Perl" by Mark Jason Dominus. It was published back in 2005 but is still pretty amazing. It's about doing things that people normally associate with languages like Haskell, but in Perl. And it doesn't stop there: the book goes on to show an implementation of a parser infrastructure somewhat like ANTLR (but, as you can imagine, with a lot less code), which is pretty mind-boggling, and then uses it for a declarative drawing system that solves linear equations to determine the positions of the elements described in a domain-oriented language.

The book can now be downloaded for free from its web site: https://hop.perl.plover.com/#free

Saturday, December 18, 2021

sentiments

It's pretty amazing how AI research is now intertwining with psychology. Not really surprising: if the current ML models emulate human thinking, this is exactly what should be happening. The amazing part is that AI research has finally reached this point.

The IEEE Computing Edge magazine tends to be rather boring but usually contains one gem per issue. In the November 2021 issue that gem was the history of the Ikonas graphics processors (I didn't realize that raster video started to appear only in the mid-1970s; before then the memory costs were prohibitively high, and the displays did vector graphics in hardware). In the December 2021 issue that gem is an article about human emotions, "The Hourglass Model Revisited" (https://www.sentic.net/hourglass-model-revisited.pdf, DOI 10.1109/MIS.2020.2992799).

They define human emotions as a combination of 6 dimensions: 4 that can be positive or negative (hence forming an "hourglass", thin in the middle and growing towards the extremes): sensitivity, attitude, temper, and introspection, plus 2 neutral ones that can be present or absent: expectation and surprise. Additionally, they split attitude into "towards self" and "towards others".

The gradations of the 4 "main" emotions from positive to negative are defined there as:

Introspection: ecstasy-joy-contentment-melancholy-sadness-grief

Temper: bliss-calmness-serenity-annoyance-anger-rage

Attitude: delight-pleasantness-acceptance-dislike-disgust-loathing

Sensitivity: enthusiasm-eagerness-responsiveness-anxiety-fear-terror

And then they define the compound emotions, for example "hate = anger + fear", or "pessimism = expectation + sadness". Though strangely, they define despair in the same way as pessimism and not as "expectation + grief"; I guess this model has quite a bit of subjectivity.

They also define the absolute strength of an emotion (in the range of -1 to +1) as the sum of the strengths of its components (or, we could say, of its present components) divided by the count of its present (i.e. non-zero) components.
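
For illustration, here is a minimal Python sketch of that formula (the function name, the ordering of dimensions in the vector, and the example values are mine):

  # A sentiment is a vector of 6 components, each in [-1, 1],
  # with 0 meaning that the dimension is absent.
  def emotion_strength(components):
      present = [c for c in components if c != 0]
      if not present:
          return 0.0
      return sum(present) / len(present)

  # e.g. "hate = anger + fear": moderately negative temper and
  # sensitivity, all the other dimensions absent
  print(emotion_strength([0, 0, -0.7, -0.6, 0, 0]))  # prints -0.65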

This is an improvement on always dividing by 4, but I wonder if it could be improved further by just taking the sum of the highest negative and highest positive components. Their formula already shifts the weights in that direction compared to the previous one, and arguably the presence of a little bit of a side emotion would not reduce the effect of the strongest one, and might even add to it if both emotions point in the same direction. So maybe, for each direction, it should be not even

  max(abs(Ei))

but

  1-Product(1-abs(Ei))
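
In code, the two alternatives would look like this (a sketch under my own naming; the second formula is the classic "noisy-OR" combination from probability theory, which can only grow as more components are added):

  def combine_max(components):
      # the strongest component wins, the others are ignored
      return max(abs(c) for c in components)

  def combine_noisy_or(components):
      # every non-zero component pushes the result further towards 1
      p = 1.0
      for c in components:
          p *= 1.0 - abs(c)
      return 1.0 - p

  print(combine_max([0.6, 0.3]))       # 0.6
  print(combine_noisy_or([0.6, 0.3]))  # 0.72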

Then I remembered where I've seen the same problem before: in Bayesian deduction! Just rescale every sentiment from [-1, 1] to [0, 1] and treat it as a Bayesian probability of evidence that the overall sentiment is positive. Then compose the sentiments by Bayesian deduction, either with the classic formula or with the formula by odds/chances as described in https://babkin-cep.blogspot.com/2017/02/a-better-explanation-of-bayes-adaboost.html. The sentiments at 0 would naturally translate to 0.5 and won't affect the computation.
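
A sketch of that composition with the odds formula (the function is my own; inputs at exactly -1 or +1 would need clamping to avoid a division by zero):

  # Rescale each sentiment from [-1, 1] to a probability in [0, 1],
  # multiply the odds, and rescale the result back. A sentiment of 0
  # maps to probability 0.5, i.e. odds of 1, and drops out.
  def compose_bayes(sentiments):
      odds = 1.0
      for s in sentiments:
          p = (s + 1.0) / 2.0
          odds *= p / (1.0 - p)
      p = odds / (1.0 + odds)
      return 2.0 * p - 1.0

  print(compose_bayes([0.5, 0.5]))   # 0.8, agreeing sentiments reinforce
  print(compose_bayes([0.5, 0.0]))   # 0.5, a zero sentiment has no effect
  print(compose_bayes([0.5, -0.5]))  # 0.0, opposite sentiments cancel out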

Thursday, December 2, 2021

retraining the neural networks

IEEE Spectrum printed an issue on AI where they described a way to teach an old neural network new tricks without making it forget the old tricks: by marking the more important connections as immutable when training the network on a data set for a new kind of classification.

This looks like a generalization of the technique of "emergents" (see https://babkin-cep.blogspot.com/2017/06/emergents.html): there, everything below the top 2 layers is declared immutable, and the top 2 layers are re-trained for the new classification from scratch. The idea is that the first training has taught the network what details are important, and the re-training then assembles a new classifier from these details.
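
A hypothetical PyTorch sketch of that "emergents" re-training (the layer sizes here are arbitrary, and the actual training loop is omitted):

  import torch.nn as nn

  # the lower layers that compute the "emergents"
  backbone = nn.Sequential(nn.Linear(100, 50), nn.ReLU(),
                           nn.Linear(50, 30), nn.ReLU())
  # the top 2 layers that get re-trained from scratch
  head = nn.Sequential(nn.Linear(30, 20), nn.ReLU(),
                       nn.Linear(20, 5))
  model = nn.Sequential(backbone, head)

  # declare everything below the top 2 layers immutable
  for p in backbone.parameters():
      p.requires_grad = False
  # then re-initialize the head and train as usual: only the head
  # weights change, the emergents stay fixed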

But the generalization goes farther: it can look at the weights of the connections, and if a weight is close to 0 (on the range of [-1, 1]), that connection can't be important, so it is a candidate for re-training, letting the model learn the new details too. They say in the article that the old classifier degrades somewhat after this re-training, but that's not surprising: the set of "emergents" has changed during the re-training while the old classifier section still expects the old set.

It would make sense to fixate the network below the top 2 layers (the "new emergents") and do another training of the old part with that. The newly learned emergents might even improve the classification of the previously learned categories. They actually talk in the article about doing a periodic re-training with the previously trained subsets of data, but not quite in the same context.
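
A hypothetical PyTorch sketch of that magnitude-based re-training (here model, loss_fn, optimizer, and new_task_loader are assumed to already exist; the threshold of 0.1 is an arbitrary pick, and weight decay is assumed to be off, since it would shrink even the masked weights):

  import torch

  # connections with weights near 0 can't be important:
  # mark them as free to re-train, freeze the rest
  masks = {name: (p.abs() < 0.1).float()
           for name, p in model.named_parameters()}

  for inputs, labels in new_task_loader:
      optimizer.zero_grad()
      loss = loss_fn(model(inputs), labels)
      loss.backward()
      with torch.no_grad():
          for name, p in model.named_parameters():
              if p.grad is not None:
                  # zero out the gradients of the important weights
                  p.grad *= masks[name]
      optimizer.step()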