Comment by Aurornis

Comment by Aurornis 7 hours ago

I experimented with the Q2 and Q4 quants. First impression is that it's amazing we can run this locally, but it's definitely not at Sonnet 4.5 level at all.

Even on my usual toy coding problems it would get simple things wrong and need some prodding to get them right.

A few times it got stuck in thinking loops and I had to cancel prompts.

This was using the recommended settings from the unsloth repository. It's always possible that there are some bugs in early implementations that need to be fixed later, but so far I don't see any reason to believe this is actually a Sonnet 4.5 level model.

margalabargala 6 hours ago

Wonder where it falls on the Sonnet 3.7/4.0/4.5 continuum.

3.7 was not all that great. 4 was decent for specific things, especially self-contained stuff like tests, but couldn't do a good job with more complex work. 4.5 is now excellent at many things.

If it's around the perf of 3.7, that's interesting but not amazing. If it's around 4, that's useful.

Computer0 an hour ago

I still have yet to find a "Small" model that can use function calls consistently enough to not be frustrating. That is the most noticeable difference I consistently see between even older "SOTA" models and the best performing "SMALL" models (<70b).

Kostic 6 hours ago

I would not go below Q8 if comparing to Sonnet.

cubefox 6 hours ago

> I experimented with the Q2 and Q4 quants.

Of course you get degraded performance with this.

Aurornis 5 hours ago

Obviously. That's why I led with that statement.
Those are the quant levels where people with mid- to high-end hardware can run this locally at reasonable speed, though.
In my experience Q2 is flaky, but Q4 isn't dramatically worse.
