Comment by quantadev 4 days ago

I recommend getting on Twitter to closely follow the leading individuals in the field of AI, and also watching the leading YouTube channels dedicated to AI research.

whimsicalism 4 days ago

Can you link to one speculating about multiple inferences for their CoT? I am curious.

edit: answer to my own question: https://x.com/_xjdr/status/1835352391648158189
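
For anyone curious what "multiple inferences for the CoT" could look like in practice, here's a rough self-consistency sketch: sample several reasoning traces and majority-vote on the final answers. The client, model name, prompt wording, and answer extraction below are all placeholder assumptions, not a claim about what OpenAI actually does.

  # Hedged sketch of self-consistency over CoT: sample N reasoning traces
  # at nonzero temperature, keep only each trace's final answer line, and
  # majority-vote across them. All names below are placeholders.
  from collections import Counter
  from openai import OpenAI

  client = OpenAI()

  def self_consistent_answer(question: str, n: int = 5) -> str:
      answers = []
      for _ in range(n):
          resp = client.chat.completions.create(
              model="gpt-4o",      # placeholder model, not Strawberry
              temperature=0.8,     # diversity across reasoning traces
              messages=[
                  {"role": "system",
                   "content": "Think step by step, then end with a line "
                              "starting with 'Answer:'."},
                  {"role": "user", "content": question},
              ],
          )
          text = resp.choices[0].message.content
          # Keep only the final answer line; the reasoning itself is discarded.
          finals = [ln for ln in text.splitlines() if ln.startswith("Answer:")]
          if finals:
              answers.append(finals[-1].removeprefix("Answer:").strip())
      # The most common answer across the sampled traces wins.
      return Counter(answers).most_common(1)[0][0] if answers else ""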

  • quantadev 4 days ago

    So far it's been unanimous. Everyone I've heard talk about it believes Strawberry is mainly just CoT. I'm not saying they didn't fine tune a model too; I'm just saying I agree with most people that clever CoT is where most of the leap in capability seems to have come from.
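
    To be clear about what I mean by "just CoT": roughly a prompt scaffold like the sketch below, where the gain comes from making the model write out its reasoning before committing to an answer. The wording is only illustrative, not what OpenAI actually uses.

      # Minimal sketch of plain, prompt-only CoT (no fine-tuning involved).
      # The prompt wording is illustrative, not OpenAI's actual scaffold.
      COT_TEMPLATE = (
          "Work through the problem step by step, showing your reasoning.\n"
          "Only after the reasoning, state the final answer on its own line.\n\n"
          "Problem: {problem}"
      )

      def build_cot_prompt(problem: str) -> str:
          # The resulting string is sent to the model as a normal prompt.
          return COT_TEMPLATE.format(problem=problem)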

    • whimsicalism 4 days ago

      > believes Strawberry is mainly just CoT. I'm not saying they didn't fine tune a model too

      You don't see this kind of scaling with respect to token length from non-fine-tuned CoT, in my opinion.

      • quantadev 4 days ago

        I haven't even added Strawberry support to my app yet, so I haven't checked what its context length is, but you're right that additional context length is a scaling factor that's totally independent of whether CoT is used or not.

        I'm just saying that whatever they did in their [new] model, I think they also added CoT on top of it, as the outer layer of the onion, so to speak.
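
        Something like the sketch below is what I mean by the outer layer: a thin wrapper that first asks the (possibly fine-tuned) base model for a reasoning trace the user never sees, then asks it to answer from that trace. The client, model name, and prompts are assumptions for illustration, not OpenAI's actual pipeline.

          # Hedged sketch of CoT as an "outer layer" around a base model.
          # Everything here is a placeholder, not OpenAI's implementation.
          from openai import OpenAI

          client = OpenAI()

          def answer_with_cot_wrapper(question: str, model: str = "gpt-4o") -> str:
              # Inner call: produce a reasoning trace that stays hidden.
              trace = client.chat.completions.create(
                  model=model,
                  messages=[{"role": "user",
                             "content": f"Reason step by step about: {question}"}],
              ).choices[0].message.content
              # Outer call: turn the hidden trace into a user-facing answer.
              final = client.chat.completions.create(
                  model=model,
                  messages=[{"role": "user",
                             "content": f"Question: {question}\n\n"
                                        f"Reasoning:\n{trace}\n\n"
                                        "Give only the final answer."}],
              ).choices[0].message.content
              return final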