Comment by gojomo

Comment by gojomo a day ago

Some might prefer the fidelity of this method's 70% savings over the lossiness of 4-bit quantization's 75%.

And maybe the methods stack, for those willing to accept both costs in exchange for the smallest representation.

svachalek a day ago

This is only a 30% savings, which is a cool technical feat, but it's hard to see a use case for it.