Comment by storystarling 5 days ago

Curious about the memory bandwidth constraints here. 20B parameters at 20fps seems like it would saturate the bandwidth of a single GPU unless you are running int4. I assume this requires an H100?
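For a rough sense of the numbers behind that question, here is a back-of-the-envelope sketch. It assumes every weight is read from GPU memory exactly once per generated frame and ignores activations, caches, and any batching. The `passes_per_frame` parameter is hypothetical: nothing in this thread says how many forward passes the model actually needs per frame, so treat the results as a lower bound under that assumption, not as a measurement.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_bandwidth_gb_s(params, fps, precision, passes_per_frame=1):
    """Estimate the memory traffic (GB/s) from streaming the weights.

    Assumption: each forward pass reads every weight once, and
    `passes_per_frame` (hypothetical) forward passes are run per frame.
    """
    bytes_per_frame = params * BYTES_PER_PARAM[precision] * passes_per_frame
    return bytes_per_frame * fps / 1e9


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        gb_s = weight_bandwidth_gb_s(20e9, 20, precision)
        print(f"{precision}: {gb_s:.0f} GB/s")
```

With one pass per frame this comes to roughly 800 GB/s at fp16, 400 GB/s at int8, and 200 GB/s at int4. Any additional passes per frame multiply those figures directly, which is why the precision and the per-frame pass count matter so much to whether a single GPU can keep up.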

andrew-w 5 days ago

Yep, the model is running on Hopper architecture. Anything less was not sufficient in our experiments.