slickytail 15 days ago

The memory bandwidth on an H100 is about 3 TB/s, for reference. That number is the limiting factor in the size of modern LLMs. 100 GB/s isn't even in the realm of viability.
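To make the bandwidth argument concrete: during single-stream decode, every generated token has to stream the full set of model weights from memory, so memory bandwidth divided by model size gives a rough upper bound on tokens per second. A minimal sketch, assuming a hypothetical 70B-parameter model in fp16 (the model size and function name are illustrative, not from the thread):

```python
def tokens_per_second(bandwidth_gb_s: float, params_billions: float,
                      bytes_per_param: float = 2.0) -> float:
    """Rough upper bound on single-stream decode speed:
    assumes all weights are read from memory once per token."""
    model_gb = params_billions * bytes_per_param
    return bandwidth_gb_s / model_gb

# Hypothetical 70B model in fp16 = 140 GB of weights:
h100_rate = tokens_per_second(3000, 70)  # ~21 tok/s at 3 TB/s
slow_rate = tokens_per_second(100, 70)   # ~0.7 tok/s at 100 GB/s
```

This ignores batching and KV-cache traffic, both of which change the picture, but it shows why a 30x bandwidth gap matters.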

torginus 14 days ago

That bandwidth is for the whole GPU, which has 6 memory chips. Anyway, what I'm proposing isn't for the high end or for training, but for making inference cheap.

And I was somewhat conservative with the numbers: a modern budget SSD with a single NAND package can do more than 5 GB/s read speed.
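The implicit arithmetic here is that many cheap NAND channels in parallel can add up to GPU-class numbers. A minimal sketch of that aggregation, assuming ideal linear scaling with no controller or interconnect bottleneck (optimistic in practice; the channel count is illustrative):

```python
def aggregate_bandwidth_gb_s(channels: int, per_channel_gb_s: float) -> float:
    """Ideal aggregate read bandwidth of independent parallel channels.
    Assumes perfect scaling, i.e. no shared-bus or controller limit."""
    return channels * per_channel_gb_s

# 20 budget-SSD-class channels at 5 GB/s each reach the 100 GB/s figure:
total = aggregate_bandwidth_gb_s(20, 5.0)  # 100.0
```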
