Comment by charles_irl 4 days ago
Yeah, there's a tension between showing enough information to be useful for driving decisions and hiding enough information.
For example, "compute capability" sounds like it'd be what you need, but it's actually more of a software versioning index :(
Was thinking of splitting the difference by collecting the quoted arithmetic throughput (FLOP/s) and memory bandwidth from the manufacturer datasheets. But there are caveats there too, e.g. the dreaded "With sparsity" asterisk on the Tensor Core FLOP/s of recent generations.
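To make the sparsity caveat concrete, here is a rough sketch of how datasheet numbers might be normalized before comparing them. It assumes the "With sparsity" figure is 2x the dense one, which is how NVIDIA quotes 2:4 structured sparsity. The `DatasheetEntry` type, the helper names, and the two example GPUs with their numbers are made up for illustration and are not taken from any real datasheet.

```python
from dataclasses import dataclass

# Assumption: a "With sparsity" figure is quoted as 2x the dense throughput
# (the 2:4 structured-sparsity convention).
SPARSITY_FACTOR = 2.0


@dataclass
class DatasheetEntry:
    """One row of a hypothetical comparison table."""
    name: str
    quoted_tflops: float        # Tensor Core TFLOP/s exactly as printed
    quoted_with_sparsity: bool  # True if the printed number carries the asterisk
    mem_bandwidth_gbs: float    # memory bandwidth in GB/s


def dense_tflops(entry: DatasheetEntry) -> float:
    """Return the dense TFLOP/s, undoing the sparsity multiplier if present."""
    if entry.quoted_with_sparsity:
        return entry.quoted_tflops / SPARSITY_FACTOR
    return entry.quoted_tflops


def ridge_point(entry: DatasheetEntry) -> float:
    """FLOPs per byte above which a kernel is compute-bound rather than bandwidth-bound."""
    flops_per_s = dense_tflops(entry) * 1e12
    bytes_per_s = entry.mem_bandwidth_gbs * 1e9
    return flops_per_s / bytes_per_s


# Illustrative placeholders only, not real hardware.
gpu_a = DatasheetEntry("gpu_a", quoted_tflops=600.0, quoted_with_sparsity=True,
                       mem_bandwidth_gbs=1500.0)
gpu_b = DatasheetEntry("gpu_b", quoted_tflops=300.0, quoted_with_sparsity=False,
                       mem_bandwidth_gbs=1500.0)

if __name__ == "__main__":
    for gpu in (gpu_a, gpu_b):
        print(gpu.name, dense_tflops(gpu), ridge_point(gpu))
```

The two placeholder entries look 2x apart on paper but come out identical once the asterisk is accounted for, which is the kind of thing a naive table would hide.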
I was looking for a simple table recently, outlining, say, how the shared memory or total register size per SM varies between generations (something like that Wiki table). It was surprisingly hard to find that info.