
Comment by danielmarkbruce a day ago


Yes, yes.

Nvidia is about to release Blackwell Ultra with 288 GB. Go back to maybe 2018 and the max was 16 GB, if memory serves.

DeepSeek recently released a 670 GB model. A couple of years ago, Falcon's 180 GB seemed huge.

spoaceman7777 a day ago

I'd assume that, in the context of LLM inference, "recent" generally refers to the Ampere generation of GPUs and later, when the demand for onboard memory went through the roof (since the first truly usable LLMs were trained on A100s).

We've been stuck with the same general caps on standard GPU memory since then, though, perhaps in part because the generational upgrades have gone into memory bandwidth rather than capacity.

  • danielmarkbruce a day ago

    Bandwidth is going up too. "It's not doubling every 18 months and hence it's not moving" isn't a sensible way to view change.

    A one-time, effective 30% reduction in model size simply isn't going to be some massive unlocker, in theory or in practice.