Comment by wslh
My 2 cents: since LLMs (large language models) are, computationally, at least some subset of Turing machines (which recognize the recursively enumerable languages), the chain-of-thought (CoT) approach could be equivalent to, or even more expressive than, that subset. In fact, CoT could perfectly be a Turing machine.
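To make the intuition concrete, here is a toy sketch and nothing more. It says nothing about what a real LLM does. The assumption is that a fixed, finite rule table stands in for one model call, and the tape stands in for the growing CoT transcript, which is the only unbounded part. The machine itself (binary increment), and the names `RULES`, `step` and `run`, are made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy machine: adds 1 to a binary number written on the tape.
# (state, symbol read) -> (next state, symbol to write, head move)
RULES = {
    ("right", "0"): ("right", "0", +1),
    ("right", "1"): ("right", "1", +1),
    ("right", "_"): ("carry", "_", -1),  # ran off the end, start carrying
    ("carry", "1"): ("carry", "0", -1),
    ("carry", "0"): ("halt", "1", 0),
    ("carry", "_"): ("halt", "1", 0),    # overflow: grow the tape leftwards
}


def step(state, tape, head):
    """One fixed-size update: the stand-in for a single model call."""
    next_state, write, move = RULES[(state, tape[head])]
    tape[head] = write
    return next_state, head + move


def run(bits):
    """Iterate the same step until it halts; only the tape grows."""
    tape = defaultdict(lambda: "_", enumerate(bits))  # blank cells read as "_"
    state, head = "right", 0
    while state != "halt":
        state, head = step(state, tape, head)
    cells = [tape[i] for i in range(min(tape), max(tape) + 1)]
    return "".join(cells).strip("_")


print(run("1011"))
print(run("111"))
```

The point of the sketch is only that the rule table never changes size: one finite step function is reused at every position, and all the unboundedness lives in the tape.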
If we leave CoT aside for a moment, it's worth looking at the paper "Neural Networks and the Chomsky Hierarchy"[1], which analyzes where neural network architectures (including Transformers, the architecture behind LLMs) fall on the Chomsky hierarchy, with a particular focus on their ability to recognize formal languages of varying complexity.
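For anyone who wants the hierarchy levels in concrete terms, here is a rough sketch using three textbook languages, one per level. These are standard examples chosen for illustration, not the tasks used in the paper, and the function names are made up. The Python only pins down membership in each language. The comments note which machine class is needed in theory.

```python
import re


def is_ab_star(s):
    """(ab)*: regular. A finite automaton, here a regex, is enough."""
    return re.fullmatch(r"(ab)*", s) is not None


def is_anbn(s):
    """a^n b^n: context-free, not regular. Needs one unbounded counter."""
    depth = 0
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:          # an 'a' after a 'b' is never allowed
                return False
            depth += 1
        elif ch == "b":
            seen_b = True
            depth -= 1
            if depth < 0:       # more b's than a's so far
                return False
        else:
            return False
    return depth == 0


def is_anbncn(s):
    """a^n b^n c^n: context-sensitive, not context-free."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n


print(is_ab_star("abab"), is_anbn("aabb"), is_anbncn("aabbcc"))
print(is_ab_star("aba"), is_anbn("abab"), is_anbncn("aabbc"))
```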
> In fact, CoT could perfectly be a Turing machine.
Are we going to need an infinite number of LLMs, arranged on a tape?