Comment by logicchains
Comment by logicchains 3 days ago
Plain RNNs are theoretically weaker than transformers with chain-of-thought (CoT): https://arxiv.org/abs/2402.18510 .
The paper says transformers perform better than RNNs, which is not surprising.
However, both are theoretically Turing-complete, so they are equally expressive.