Comment by mlyle 6 months ago
> you're trusting software developers either way, whether it be at the app level, the language/runtime level, or the operating system level.
I trust systems to do better based on observed behavior than on a software engineer's guess of how things will be scheduled. Who knows whether, in a given use case, the program is a "small" part of the system or a "large" part that should get preferential placement and scheduling?
> If a threadpool manager is hinted that 4 threads are going to share a lot of memory, they can be allocated on the same l2 cache.
And so this is kind of a weird thing: we know we're going to be performance-critical and we need things to be forced to be adjacent... but we don't know the exact details of the hardware we're running on. (Otherwise, just numa_bind and be done...)
The beauty is that you don't care what hardware you run on: all you're annotating are very useful but generic properties, such as which threads share a lot of memory, or that a thread should have the highest performance priority so that internally it stays on P-cores instead of the more scalable E-cores. These are very simple, optional hints.
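
To make the idea concrete, here is a rough sketch of what such hints could look like and how a runtime might act on them. Everything in it is hypothetical: `PlacementHints`, `place_threads`, and the topology arguments are names made up for illustration, not an existing OS or library API. It only computes a thread-to-core mapping and does not pin anything.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class PlacementHints:
    """Hypothetical hardware-agnostic hints supplied by the application."""
    # Each inner list names threads that share a lot of memory.
    shared_memory_groups: List[List[str]] = field(default_factory=list)
    # Threads that should get the fastest cores when possible.
    high_priority: Set[str] = field(default_factory=set)


def place_threads(
    hints: PlacementHints,
    cache_domains: List[List[int]],
    performance_cores: Set[int],
) -> Dict[str, int]:
    """Greedy placement under assumed topology inputs.

    cache_domains: lists of core ids that share a cache (e.g. one L2).
    performance_cores: core ids treated as "fast" (e.g. P-cores).
    Threads whose group fits nowhere are left unplaced, so the
    regular scheduler decides for them.
    """
    placement: Dict[str, int] = {}
    free = [list(domain) for domain in cache_domains]

    for group in hints.shared_memory_groups:
        # Use the first cache domain with enough free cores for the group.
        for domain in free:
            if len(domain) < len(group):
                continue
            # High-priority threads choose first.
            ordered = sorted(group, key=lambda t: t not in hints.high_priority)
            for thread in ordered:
                fast = [c for c in domain if c in performance_cores]
                slow = [c for c in domain if c not in performance_cores]
                if thread in hints.high_priority and fast:
                    core = fast[0]
                else:
                    # Keep fast cores available for threads that asked for them.
                    core = slow[0] if slow else domain[0]
                domain.remove(core)
                placement[thread] = core
            break

    return placement
```

The application only states "these four threads share memory" and "this one matters most"; the topology is supplied by whoever knows the machine, which is the split being argued for here. A real scheduler would also weigh observed behavior, which this sketch ignores.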