Comment by logicchains a year ago

Plain RNNs are theoretically weaker than transformers with chain-of-thought (CoT): https://arxiv.org/abs/2402.18510

The paper says transformers perform better than RNNs, which is not surprising.

However, both are Turing complete in theory, so in that sense they are equally expressive.