Comment by quantadev
So far it's been unanimous. Everyone I've heard talk about it believes Strawberry is mainly just CoT. I'm not saying they didn't fine-tune a model too, I'm just saying I agree with most people that clever CoT is where most of the leap in capability seems to have come from.
> believes Strawberry is mainly just CoT. I'm not saying they didn't fine-tune a model too
In my opinion, you don't see scaling with token length like this from CoT on a model that hasn't been fine-tuned.