Comment by vessenes 19 hours ago

Thanks Jeff -- can you point me to something written up about rANS? All I find online is turbulence modeling material; I presume that is not what you're referring to.

As we know, quantization is a critical tool for local LLM runners; RAM is typically the gating factor. Are you aware of other, better lossless compression schemes for BF16 weights out there?

The reason I ask is that DFloat11 seems relatively easy to plug into existing quantization workflows, but you seem dismissive of the paper -- I presume the gap is in my understanding, and I'd like to close it.

zorgmonkey 19 hours ago

I don't know of any great write-ups, unfortunately, but the rANS you're looking for is range asymmetric numeral systems (the range variant of ANS, an entropy coder).
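
In the absence of a good write-up, here is a rough sketch of the core rANS state update in Python. It is a toy under stated assumptions, not how any production codec or the DFloat11 paper does it: it keeps the whole state in one arbitrary-precision integer instead of renormalizing into a byte stream, it uses exact symbol counts as the frequency table, and the names `build_table`, `rans_encode`, and `rans_decode` are made up for illustration.

```python
from collections import Counter

def build_table(data):
    """Exact symbol counts plus cumulative starts (illustrative helper)."""
    freqs = Counter(data)
    cum, total = {}, 0
    for sym in sorted(freqs):
        cum[sym] = total          # start of this symbol's slot range
        total += freqs[sym]
    return freqs, cum, total

def rans_encode(data, freqs, cum, total):
    """Fold all symbols into one big integer state (no renormalization)."""
    x = 1                         # arbitrary initial state
    # Encode in reverse so the decoder emits symbols in forward order.
    for sym in reversed(data):
        f = freqs[sym]
        x = (x // f) * total + cum[sym] + (x % f)
    return x

def rans_decode(x, n, freqs, cum, total):
    """Invert rans_encode; n is the number of symbols to recover."""
    out = []
    for _ in range(n):
        slot = x % total
        # Linear search for the symbol whose range contains the slot.
        sym = next(s for s in freqs if cum[s] <= slot < cum[s] + freqs[s])
        x = freqs[sym] * (x // total) + slot - cum[sym]
        out.append(sym)
    return bytes(out)

# Skewed byte distribution, loosely in the spirit of BF16 exponent bytes
# clustering around a few values (the values here are invented).
payload = bytes([127] * 90 + [126] * 8 + [128] * 2)
freqs, cum, total = build_table(payload)
state = rans_encode(payload, freqs, cum, total)
restored = rans_decode(state, len(payload), freqs, cum, total)
```

The point of the toy is that frequent symbols grow the state by a small factor and rare ones by a large factor, so the final integer needs roughly entropy-many bits. Real implementations keep the state in a fixed-width word and spill bytes as it grows, and they also have to store the frequency table alongside the payload.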