Comment by ttul

Pasting from my Perplexity page on the topic:

The core innovation [1] of o1 lies in its ability to generate and refine internal chains of thought before producing a final output [2]. Unlike traditional LLMs that primarily focus on next-token prediction, o1 learns to:

1. Recognize and correct mistakes
2. Break down complex steps into simpler ones
3. Try alternative approaches when initial strategies fail

This process allows o1 to tackle more complex, multi-step problems, particularly in STEM fields.
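To make the three behaviours above concrete, here is a toy sketch of a "try, check, fall back" loop. It is only an illustration and not how o1 works internally, which OpenAI has not published. The task (finding an integer square root) and every name in it (`guess_half`, `linear_search`, `verify`, `solve`) are hypothetical and chosen for the example.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical toy task: find an integer x with x * x == n.

def guess_half(n: int) -> Optional[int]:
    """A cheap first attempt that is usually wrong."""
    return n // 2

def linear_search(n: int) -> Optional[int]:
    """A slower fallback that checks candidates one by one."""
    for x in range(n + 1):
        if x * x == n:
            return x
    return None

def verify(n: int, x: int) -> bool:
    """Check a candidate answer instead of trusting it."""
    return x * x == n

def solve(
    n: int, strategies: List[Callable[[int], Optional[int]]]
) -> Tuple[Optional[int], List[str]]:
    """Try each strategy in order and keep a trace of what was attempted."""
    trace: List[str] = []
    for strategy in strategies:
        candidate = strategy(n)
        if candidate is not None and verify(n, candidate):
            trace.append(f"{strategy.__name__}: accepted {candidate}")
            return candidate, trace
        # The mistake is noticed here, and the loop moves on to the next approach.
        trace.append(f"{strategy.__name__}: rejected {candidate}")
    return None, trace

answer, trace = solve(49, [guess_half, linear_search])
print(answer)
print(trace)
```

The trace plays the role of a visible chain of thought: the first guess is checked, rejected, and replaced by a different strategy rather than being returned as the final answer.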

OpenAI reports observing new "scaling laws" with o1 [5]:

1. Train-time compute: Performance improves with more extensive reinforcement learning during training.
2. Test-time compute: Accuracy increases when the model is allowed more time to "think" during inference.

This suggests a trade-off between inference speed and accuracy.
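One way to get a feel for that trade-off is majority voting over several sampled answers (often called self-consistency). This is a different and much simpler technique than whatever o1 does with its extra thinking time, which is not public. The sketch below assumes each sample is independently correct with probability `p` and that there are only two possible answers; both assumptions, and the function name, are mine.

```python
from math import comb

def majority_vote_accuracy(p: float, k: int) -> float:
    """Probability that a majority of k independent samples is correct.

    Assumes each sample is correct with probability p, there are only two
    possible answers, and k is odd so that ties cannot happen.
    """
    if k % 2 == 0:
        raise ValueError("use an odd k to avoid ties")
    needed = k // 2 + 1  # smallest number of correct samples that wins the vote
    return sum(
        comb(k, i) * p**i * (1 - p) ** (k - i)
        for i in range(needed, k + 1)
    )

# More samples means more inference cost, and higher accuracy whenever p > 0.5.
for k in (1, 3, 5, 9, 17):
    print(k, round(majority_vote_accuracy(0.6, k), 3))
```

With `p = 0.6`, one sample is right 60% of the time and three samples with a vote are right 64.8% of the time, at three times the inference cost. The gain only appears when a single sample is already better than chance.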

Sources
[1] Introducing OpenAI o1 https://medium.com/%40sriramramakrishnan.aiexpert/openais-o1...
[2] Learning to Reason with LLMs | OpenAI https://openai.com/index/learning-to-reason-with-llms/
[3] OpenAI o1 models - FAQ [ChatGPT Enterprise and Edu] https://help.openai.com/en/articles/9855712-openai-o1-models...
[4] OpenAI releases new o1 reasoning model - The Verge https://www.theverge.com/2024/9/12/24242439/openai-o1-model-...
[5] 9 things you need to know about OpenAI's powerful new AI model o1 https://fortune.com/2024/09/13/openai-o1-strawberry-model-9-...
[6] Notes on OpenAI's new o1 chain-of-thought models https://simonwillison.net/2024/Sep/12/openai-o1/
[7] OpenAI just dropped o1 Model that can 'reason' through complex ... https://www.tomsguide.com/ai/openais-o1-model-takes-ai-to-a-...
[8] Models - OpenAI API https://platform.openai.com/docs/models
[9] OpenAI Unveils O1 - 10 Key Facts About Its Advanced AI Models https://www.forbes.com/sites/janakirammsv/2024/09/13/openai-...

bn-l 10 months ago

That answers nothing the commenter asked.


ttul 10 months ago

Thanks for the critique. Here is how I would answer their question myself:
o1 is far more than just chain-of-thought (CoT) mechanics. It relies on a specialized model, or a collection of models, that offers new capabilities and makes CoT work far better than it does with a stock LLM.
For instance, o1 can recognize and correct its own mistakes, and it seems to know how to dig deeper when needed. That's not something stock LLMs do very well.
