selfhoster11 19 hours ago

The full thing, 671B. It loses some intelligence at 1.5-bit quantisation, but it's acceptable. I could actually go to around 3 bits if I maxed out my RAM, but I haven't done that yet.
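
    [Editor's note] A minimal sketch of why fewer bits per weight costs accuracy, using naive symmetric round-to-nearest quantisation. This is an illustration only: the 1.5-bit quants discussed here (llama.cpp-style k-quants) use block-wise scales and non-uniform grids, not this simple scheme.

    ```python
    import numpy as np

    def quantize(w: np.ndarray, bits: int):
        # Map weights onto signed integers in [-qmax, qmax] with one global scale.
        qmax = 2 ** (bits - 1) - 1
        scale = np.abs(w).max() / qmax
        q = np.clip(np.round(w / scale), -qmax, qmax)
        return q, scale

    def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
        return q * scale

    rng = np.random.default_rng(0)
    w = rng.normal(size=4096).astype(np.float32)  # stand-in for a weight tensor
    for bits in (8, 4, 2):
        q, s = quantize(w, bits)
        err = float(np.abs(w - dequantize(q, s)).mean())
        print(f"{bits}-bit mean abs error: {err:.4f}")
    ```

    The mean reconstruction error grows as the bit width shrinks, which is the "loses some intelligence" trade-off in miniature.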

  • apitman 18 hours ago

    I've seen people say the models get more erratic at more aggressive (lower-bit) quantization levels. What's your experience been?