sebzim4500 5 days ago

This is basically CoT, so it's already the norm for a lot of benchmarks. I think the value proposition here is that it puts a nice UX around using it in a chat interface.

  • ehsanu1 5 days ago

    That was my initial position too, but I think there is a search efficiency story here as well. CoT comes in many flavors and improves when tailored to the problem domain. If the LLM can figure out the right problem-solving strategy for a given problem on its own, this may improve performance per unit of compute versus discovering that strategy through search at inference time.

    Tailoring prompts is likely still the best way to maximize performance when you can, but in broader domains you'd work around this through strategies like asking the LLM to combine predefined reasoning modules, creating multiple reasoning chains and merging/comparing them, explicit MCTS, etc. (the multiple-chains idea is sketched below). I think those strategies will still be useful for a good while, but pieces of that search process, especially directing the search more efficiently, will move into the LLMs over time as they get trained on this kind of data.
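
    For concreteness, here's a minimal sketch of the multiple-chains strategy (often called self-consistency): sample several independent CoT completions and merge them by majority vote over the final answers. `sample_chain` is a hypothetical stand-in for an LLM call, not any particular API:

      # Self-consistency sketch: sample N independent reasoning
      # chains, then merge by majority vote over final answers.
      import random
      from collections import Counter

      def sample_chain(question: str) -> str:
          # Placeholder: a real version would call an LLM with a
          # CoT prompt at temperature > 0 so chains actually differ,
          # then extract the final answer from the completion.
          return random.choice(["4", "4", "5"])

      def self_consistency(question: str, n_chains: int = 8) -> str:
          answers = [sample_chain(question) for _ in range(n_chains)]
          # Merge step: keep the answer most chains agree on.
          return Counter(answers).most_common(1)[0][0]

      print(self_consistency("What is 2 + 2?"))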

  • Meganet 4 days ago

    It's like saying geometry is just math. Proofs are just math.

    They didn't spend millions training a model on expert data just to basically use CoT. That's a harsh simplification, probably.