Comment by thorum 5 days ago
o1’s innovation is not Chain-of-Thought. It’s teaching the model to do CoT well (from massive amounts of human feedback) instead of just pretending to. You’ll never get o1 performance just from prompt engineering.
> from massive amounts of human feedback
It might be OpenAI's 200M-user base that implicitly provided the necessary guidance for advanced CoT. Every user chat session is also an opportunity for the model to get feedback and draw on the user's experience.