Comment by badmonster a day ago
What stands out most is the practical implication: enabling lossless inference of a 405B-parameter model on a single node with 8×80GB GPUs is wild. That’s a huge unlock for research labs and startups alike that want to run frontier models without massive infrastructure costs.
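The arithmetic behind why that's a big deal: at BF16, the 405B weights alone come to roughly 810 GB, more than the 640 GB of HBM an 8×80GB node provides, so you need some form of compression just to load the model. A rough back-of-envelope sketch (my own numbers, not from the article, and it ignores KV cache and activations):

```python
# Back-of-envelope: why 405B params in BF16 don't fit on 8x80GB GPUs,
# and what lossless compression ratio is needed just for the weights.
# Numbers are illustrative assumptions, not figures from the article.

PARAMS = 405e9            # parameter count (Llama-3.1-405B scale)
BYTES_PER_PARAM_BF16 = 2  # BF16 = 16 bits per weight
GPUS = 8
GPU_MEM_GB = 80

weights_gb = PARAMS * BYTES_PER_PARAM_BF16 / 1e9   # ~810 GB of raw weights
node_gb = GPUS * GPU_MEM_GB                        # 640 GB of total HBM

print(f"BF16 weights:  {weights_gb:.0f} GB")
print(f"Node capacity: {node_gb} GB")
print(f"Weights must shrink to ~{node_gb / weights_gb:.0%} of original size "
      f"(before counting KV cache and activations)")
```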
> That’s a huge unlock for research labs and startups alike that want to run frontier models without massive infrastructure costs.
Or let one of the neoclouds take care of the infrastructure costs and rent from them instead. Disclosure: I run one of them.