Comment by baq
Now just need an autoregressive transformer <==> RNN isomorphism paper and we're golden
The paper says transformers perform better than RNNs, which is not surprising.
However, both are, in theory, Turing complete, so they are equally expressive.
Plain RNNs are theoretically weaker than transformers with chain-of-thought (CoT): https://arxiv.org/abs/2402.18510.
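For what it's worth, here's a toy numpy sketch (my own code, not from any paper) of why the "autoregressive transformer as RNN" reading is tempting: each decoding step of causal self-attention can be read as a state-update rule whose state is the KV cache. The catch is that this "state" grows with the sequence, unlike a plain RNN's fixed-size hidden state, which is exactly where the simple isomorphism breaks down.

```python
import numpy as np

def decode_step(state, x_t, Wq, Wk, Wv):
    # Hypothetical single-head causal attention step; Wq/Wk/Wv are stand-in
    # parameters for illustration only.
    q, k, v = x_t @ Wq, x_t @ Wk, x_t @ Wv
    K = np.vstack([state[0], k[None, :]])  # append to cached keys
    V = np.vstack([state[1], v[None, :]])  # append to cached values
    scores = K @ q / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    # New "recurrent" state (the grown KV cache) and this step's output.
    return (K, V), w @ V

d = 8
rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
state = (np.empty((0, d)), np.empty((0, d)))  # empty KV cache
for t in range(5):
    state, y_t = decode_step(state, rng.standard_normal(d), Wq, Wk, Wv)
```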