Comment by petermcneeley
Comment by petermcneeley 6 days ago
Great work. Nice aesthetic.
"These groups of threads, known as warps , are switched out on a per clock cycle basis — roughly one nanosecond. CPU thread context switches, on the other hand, take few hundred to a few thousand clock cycles"
I would note that intels SMT does do something very similar (2 hw threads). Other like the xeon phi would round robin 4 threads on a single core.
SMT isn't that really is it?
SMT allows for concurrent execution of both threads (thus independent front-end for fetch, decode especially) and certain core resources are statically partitioned unlike a warp being scheduled on SM.
I'm not a graphics expert but warps seem closer to run-time/dynamic VLIW than SMT.