Comment by TacticalCoder 2 days ago

As someone who used to eat assembly instructions for breakfast back in the day, and who remembers when a MUL took more than 1 cycle, is there any resource you'd recommend for learning to use the highly vectorized/parallelized instruction sets in modern CPUs?

I know about Daniel Lemire / lemire.me

Anybody / anything else you'd recommend?

dzaima 2 days ago

There's https://www.agner.org/optimize/microarchitecture.pdf for a lot of microarchitectural information, covering many interesting things beyond the per-instruction breakdowns you get from uops.info. chipsandcheese.com also covers a lot. Then there are various more specific write-ups scattered around, e.g. http://www.numberworld.org/blogs/2024_8_7_zen5_avx512_teardo..., https://web.archive.org/web/20240602004718/https://www.merse..., and more. https://dougallj.github.io/applecpu/firestorm.html covers Apple M1.