Comment by gojomo

Comment by gojomo 8 months ago

Some might prefer the fidelity of this method's 70% savings over the lossiness of 4-bit quantization's 75%.

And, maybe the methods stack for those willing to trade both costs for the smallest representation.

svachalek 8 months ago

This is only a 30% savings, which is a cool technical feat but hard to see a use case for.
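
The gap between the two figures above is the difference between "compressed to about 70% of the original size" and "70% saved". A rough sketch of the arithmetic, assuming a 16-bit baseline (which is what makes 4-bit quantization a 75% reduction) and taking the 30% savings figure from this reply at face value; the `savings` helper and constant names are illustrative only:

```python
def savings(original_bits: float, compressed_bits: float) -> float:
    """Fraction of storage saved relative to the original size."""
    return 1.0 - compressed_bits / original_bits


BASELINE_BITS = 16.0  # assumption: 16-bit weights as the baseline

# Assumption: the lossless method leaves ~70% of the original size,
# i.e. roughly 11.2 effective bits per weight.
lossless_bits = 0.70 * BASELINE_BITS

# 4-bit quantization stores 4 bits per weight (ignoring any overhead).
quant_bits = 4.0

print(f"lossless: {savings(BASELINE_BITS, lossless_bits):.0%} saved")
print(f"4-bit:    {savings(BASELINE_BITS, quant_bits):.0%} saved")
```

So the real comparison is roughly 30% saved with no loss against 75% saved with some loss, not 70% against 75%.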
