Comment by orbital-decay 4 days ago
Scaling like what? Are there any comparisons with and without CoT, or against other models using their own CoT? As far as I'm aware, their CoT part is secret. I'm sure the fine-tuning does some of the lifting, but I'm also sure the difference in a fair comparison won't be remotely as significant as the current hype suggests.
This is still clearly CoT, with all the limitations and caveats that implies. That's an improvement, sure, but definitely not the qualitative leap OAI is presenting it as (in a really shady manner).
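For what it's worth, the kind of with/without-CoT comparison I mean is easy to sketch: prompt the same model both ways on a fixed task set and compare accuracy. A minimal sketch using the OpenAI chat API (the model name, the toy task, and the substring grading are my own placeholders, not anything from OpenAI's report):

    from openai import OpenAI

    client = OpenAI()

    # Hypothetical eval items: (question, expected answer substring)
    TASKS = [
        ("If a train travels 60 km in 45 minutes, what is its speed in km/h?", "80"),
    ]

    def ask(question: str, use_cot: bool) -> str:
        # The only difference between the two arms is the prompt suffix
        suffix = ("\nThink step by step, then state the final answer."
                  if use_cot else "\nGive only the final answer.")
        resp = client.chat.completions.create(
            model="gpt-4o",  # assumed baseline model; swap in whatever is being compared
            messages=[{"role": "user", "content": question + suffix}],
        )
        return resp.choices[0].message.content

    def accuracy(use_cot: bool) -> float:
        # Crude grading: count an answer correct if the expected string appears
        hits = sum(expected in ask(q, use_cot) for q, expected in TASKS)
        return hits / len(TASKS)

    print("without CoT:", accuracy(False))
    print("with CoT:   ", accuracy(True))

With a real benchmark in TASKS, that's the fair baseline any "scaling" claim should be measured against.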
The CoT performance of every other SOTA model quickly plateaus as tokens grow; none of them shows anything like o1's test-time compute curve.
Saying it's just CoT is kind of meaningless. Just look at the examples on OpenAI's blog and you quickly see that no other model today can generate or utilize CoT at anywhere near that quality through prompting or naive fine-tuning.