Comment by ertgbnm
But the model doesn't have an internal state, it just has the tokens, which means it must encode it's reasoning into the output tokens. So it is a reasonable take to think that CoT was them showing their work.
But the model doesn't have an internal state, it just has the tokens, which means it must encode it's reasoning into the output tokens. So it is a reasonable take to think that CoT was them showing their work.