Comment by storystarling 5 days ago
Curious about the memory bandwidth constraints here. Running 20B parameters at 20 fps seems like it would saturate the memory bandwidth of a single GPU unless you are running INT4. I assume this requires an H100?
Yep, the model is running on Hopper architecture. Anything less was not sufficient in our experiments.
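For anyone who wants to sanity-check the bandwidth question above, here is a rough back-of-envelope sketch. It assumes every forward pass reads all 20B weights from GPU memory once, and it ignores activations, caches, and batching. The number of passes per frame is not stated in this thread, so `passes_per_frame` is a hypothetical knob, and the ~3350 GB/s peak figure is the commonly quoted H100 SXM number, used here only as an assumed reference point.

```python
def weight_traffic_gb_per_s(params, bits_per_param, fps, passes_per_frame=1):
    """Estimate GB/s of weight reads, assuming one full weight read per pass."""
    bytes_per_pass = params * bits_per_param / 8
    return bytes_per_pass * passes_per_frame * fps / 1e9


PARAMS = 20e9               # 20B parameters (from the comment above)
FPS = 20                    # target frame rate (from the comment above)
ASSUMED_PEAK_GB_S = 3350    # assumption: approximate H100 SXM peak bandwidth

if __name__ == "__main__":
    for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
        for passes in (1, 4):   # hypothetical passes per frame
            traffic = weight_traffic_gb_per_s(PARAMS, bits, FPS, passes)
            share = traffic / ASSUMED_PEAK_GB_S
            print(f"{name}, {passes} pass(es)/frame: "
                  f"{traffic:.0f} GB/s ({share:.0%} of assumed peak)")
```

Under these assumptions the weight traffic scales linearly with both precision and passes per frame, so the answer depends heavily on how many passes each frame actually needs, which only the authors can confirm.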