Comment by quantadev

Comment by quantadev 10 months ago

So far it's been unanimous. Everyone I've heard talk about it believes Strawberry is mainly just CoT. I'm not saying they didn't fine-tune a model too, I'm just saying I agree with most people that clever CoT is where most of the leap in capability seems to have come from.

whimsicalism 10 months ago

> believes Strawberry is mainly just CoT. I'm not saying they didn't fine tune a model too

You don't see the scaling with respect to token length with non-FT'd CoT like this, in my opinion.

quantadev 10 months ago

I haven't even added Strawberry support to my app yet, so I haven't checked what its context length is, but you're right that additional context length is a scaling factor that's totally independent of whether CoT is used or not.
I'm just saying whatever they did in their [new] model, I think they also added CoT on top of it, as the outer layer of the onion so to speak.
