Oras 4 days ago

Well, with Tree of Thought (ToT) and fine-tuned models, I'm sure you can achieve the same performance, with room to improve further as you identify the bottlenecks.
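For reference, a ToT-style search is essentially beam search over partial "thought" chains: expand several candidate continuations, score them, and keep only the most promising branches. A minimal toy sketch (the digit-sum task, function names, and parameters are my own illustration, not anyone's actual implementation):

```python
# Toy Tree-of-Thought-style search: at each step, expand candidate
# "thoughts", score them with a heuristic, and prune to a small beam.

TARGET = 10   # toy goal: pick DEPTH digits that sum to TARGET
DEPTH = 3     # number of reasoning steps
BEAM = 4      # candidate branches kept per step

def expand(state):
    """Generate candidate next thoughts (here: append one digit)."""
    return [state + [d] for d in range(10)]

def score(state):
    """Heuristic: how close the partial sum is to the target."""
    return -abs(TARGET - sum(state))

def tree_of_thought():
    frontier = [[]]  # start with an empty chain of thoughts
    for _ in range(DEPTH):
        candidates = [s for state in frontier for s in expand(state)]
        # Prune: keep only the most promising branches (beam search).
        frontier = sorted(candidates, key=score, reverse=True)[:BEAM]
    return frontier[0]

best = tree_of_thought()
print(best, sum(best))  # a 3-digit chain summing to 10
```

In a real LLM setting, `expand` would sample continuations from the model and `score` would be a model- or rule-based evaluator; the pruning loop is the same idea.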

I'm not convinced OpenAI is using one model. Look at the thinking process (in the UI), which takes time, and then suddenly the output is streamed out at high speed.

But even so, people are after results, not the underlying technology. There's no difference between doing it with one model and doing it with multiple models.

alach11 4 days ago

> I'm not convinced OpenAI is using one model. Look at the thinking process (UI), which takes time, and then suddenly, you have the output streamed out at high speed.

According to OpenAI, the model does its thinking behind the scenes, then at the end summarizes that thinking for the user. We don't get to see the original chain-of-thought reasoning, just the AI's own summary of it. That explains the output timing.