Comment by pton_xd
I was under the impression that CoT works because spitting out more tokens = more context = more compute used to "think." Using CoT as a way for LLMs to "show their working" never seemed logical to me. It's just extra synthetic context.
Humans sometimes draw a diagram to help them think through a problem they are trying to solve. The paper contains nothing that the brain didn't already know, yet drawing it out is often an effective technique.
Part of the benefit is that it keeps the most salient details front and center. Part of it is that the brain isn't fully connected, so putting the problem on paper lets (in this case) the visual system apply its processing abilities and work on the problem from a different angle, rather than keeping all the information in the conceptual domain.