Wednesday, February 19, 2020

Scheduling policy for burst multithreading

Over a year ago I've been reading Eric Raymond's blog post on the limitations of multithreading: http://esr.ibiblio.org/?p=8223

That gave me an idea that on the desktop we routinely have the spare CPUs going to waste. If we could reduce the cost of cross-CPU interaction, that would allow to schedule efficiently even the small snippets of parallel code. And we can reduce this cost by scheduling multiple threads of one process together, even if they have nothing to do - then they would just park the CPU in user space, and wake up quickly, directly in the user space, with already pre-heated cache and TLB. And then we can use the fast low-overhead inter-thread communications directly in the user space.

And then I've realized that another way to look at it is kind of like superscalar execution on a higher level, sort of what they've tried to do in Itanium but better.

Well, this finally ended up in a defense publication:
https://www.tdcommons.org/dpubs_series/2896/

Not sure if it will end up in anything useful but at least the idea is out there and you can read the details of it.