Sunday, September 27, 2020

a book on probability

I've stumbled upon the book "Introduction to probability" by Anderson, Seppalainen, and Valko. It's a college textbook. 

Right in the second chapter it does a very good introduction into the Bayesian formula, deriving it from the descriptions of weights of events, in the same way as I struggled to reinvent on my own. And right there it also provides the "General version of Bayes' formula" for the enumerated events, that took me so much time to reinvent on my own. Well, another case of "everything is invented before us" :-) If only we knew, what books to read at the right time :-) On the other hand, I'm not sure if I would have recognized the importance of that formula if I didn't have a use for it in mind first.

In case if you wonder, here is the formula: 

If B1...Bn are events that partition the sample space, then for any event A we have:

P(A) = sum from i=1 to n of (P(A & Bi)) = sum from i=1 to n of (P(A|Bi)*P(Bi))

Then 

P(Bk|A) = P(A & Bk) / P(A) = P(A|Bk)*P(Bk) / (sum from i=1 to n of (P(A|Bi)*P(Bi)))

Wednesday, September 23, 2020

cheap malloc-ed stacks

I've accidentally realized that the disjointed stacks (ones composed of the separately allocated chunks) can be pretty inexpensive if they get allocated in fixed pages, and each stack frame is guaranteed to be less than a page (which is fairly common in, say, the OS kernel). And the stack pages can be made larger than the physical pages.

All that the function needs to do at the start of the call is check that there is enough room to the end of the page. Which is easy enough by checking that (SP & PAGE_SIZE) < (PAGE_SIZE-FRAME_SIZE). Well, taking a page from the RISC book, we'd also want to reserve some room at the end of the page for the leaf functions that could then skip the whole check, but it doesn't change the general approach.

If a new page needs to be allocated, it can be taken from a pool of free stack pages (a linked list), and would be linked to the previous page, then the frame created in it (copying the old stack pointer onto the new stack, as in the "classic" Intel x86 sequence of "push BP" that saves the pointer to the previous stack frame). The next page can also be found already linked to the current page from the last time, and then the allocation can be skipped. The function return then doesn't need to do anything special, just "pop SP" and return. And then if the thread blocks, the scheduler can return the unused pages beyond the last active frame back to the pool.

There obviously is some overhead but not a real huge amount of it.