Comment by kccqzy a day ago

What they mean is that they are rewriting low-level synchronization primitives so as not to penalize AMD CPUs. For example, on AMD Rome CPUs, the cross-CCD latency of an atomic instruction can be as high as 200 nanoseconds, even when the memory location it accesses is supposedly already in cache. Common code patterns, such as multiple cores atomically incrementing a single counter, would have borderline acceptable performance on Intel but terrible performance on AMD.
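To make the counter case concrete, here is a rough sketch of one common workaround: splitting the single counter into per-shard slots so that cores on different CCDs stop fighting over one cache line. This is not the actual rewrite being discussed, just an illustration. `ShardedCounter`, `PaddedCounter`, and the 64-byte line size are all assumptions of mine, and it needs C++17 so that the over-aligned slots are allocated correctly.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One slot per shard, padded to its own cache line so increments to
// different shards never touch the same line.
// Assumption: 64-byte cache lines.
struct alignas(64) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

// Hypothetical sharded counter: writers hit only their own shard,
// readers pay the cost of summing every shard.
class ShardedCounter {
public:
    explicit ShardedCounter(std::size_t shards) : slots_(shards) {
        assert(shards > 0);
    }

    // `shard` is a caller-supplied hint, e.g. a CCD index or thread index.
    // It is reduced modulo the shard count, so any value is safe.
    void add(std::size_t shard, std::uint64_t n = 1) {
        slots_[shard % slots_.size()].value.fetch_add(
            n, std::memory_order_relaxed);
    }

    // Sums all shards. The result is only an exact snapshot if no
    // writer is running concurrently.
    std::uint64_t read() const {
        std::uint64_t total = 0;
        for (const auto& slot : slots_) {
            total += slot.value.load(std::memory_order_relaxed);
        }
        return total;
    }

private:
    std::vector<PaddedCounter> slots_;
};
```

The trade-off is that reads get slower and slightly fuzzy under concurrent writes, which is usually fine for statistics counters but not for anything that needs an exact value at a point in time.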

Or consider things like CPU core allocators, which now need to be CCD-aware when allocating cores within a CPU to a container.
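As a toy illustration of what "CCD-aware" could mean for an allocator, the sketch below only hands out cores that all sit on one CCD, preferring the tightest fit. Everything here is hypothetical: `CcdAwareAllocator` is a made-up name, the core-to-CCD map is assumed to be supplied by the caller, and real schedulers would have to handle requests larger than one CCD, SMT siblings, NUMA nodes, and freeing cores. Requires C++17 for `std::optional`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <vector>

// Hypothetical allocator that keeps free cores grouped by CCD.
class CcdAwareAllocator {
public:
    // core_to_ccd[i] is the CCD id of core i. How this map is obtained
    // is platform-specific and outside the scope of this sketch.
    explicit CcdAwareAllocator(const std::vector<int>& core_to_ccd) {
        for (std::size_t core = 0; core < core_to_ccd.size(); ++core) {
            free_[core_to_ccd[core]].push_back(static_cast<int>(core));
        }
    }

    // Allocates n cores that all belong to a single CCD. Best fit:
    // picks the CCD with the fewest free cores that can still satisfy
    // the request. Returns std::nullopt if no single CCD has n free.
    std::optional<std::vector<int>> allocate(std::size_t n) {
        if (n == 0) return std::vector<int>{};
        auto best = free_.end();
        for (auto it = free_.begin(); it != free_.end(); ++it) {
            const bool fits = it->second.size() >= n;
            if (fits && (best == free_.end() ||
                         it->second.size() < best->second.size())) {
                best = it;
            }
        }
        if (best == free_.end()) return std::nullopt;

        // Take the last n free cores of the chosen CCD.
        auto& pool = best->second;
        std::vector<int> taken(pool.end() - static_cast<std::ptrdiff_t>(n),
                               pool.end());
        pool.resize(pool.size() - n);
        assert(taken.size() == n);
        return taken;
    }

private:
    std::map<int, std::vector<int>> free_;  // CCD id -> free core ids
};
```

Refusing the request when no single CCD fits is a deliberate simplification. A real allocator would more likely fall back to spanning the smallest number of CCDs.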