Comment by baq a year ago

2 replies

Now just need an autoregressive transformer <==> RNN isomorphism paper and we're golden
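Such an equivalence already exists for one restricted family: linear (kernelized) attention, which the "Transformers are RNNs" paper (Katharopoulos et al., 2020) shows can be computed exactly as a recurrence. A minimal NumPy sketch, assuming a single head and the common elu+1 feature map, compares the attention-style and RNN-style computations:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4  # sequence length, head dimension
Q, K, V = rng.standard_normal((3, T, d))

def phi(x):
    # positive feature map; elu(x) + 1 is a common choice
    return np.where(x > 0, x + 1.0, np.exp(x))

# Parallel, attention-style computation with a causal mask:
# out_t = sum_{j<=t} (phi(q_t).phi(k_j)) v_j / sum_{j<=t} phi(q_t).phi(k_j)
out_attn = np.zeros((T, d))
for t in range(T):
    scores = phi(Q[t]) @ phi(K[:t + 1]).T   # (t+1,) unnormalized weights
    out_attn[t] = (scores / scores.sum()) @ V[:t + 1]

# Recurrent, RNN-style computation: carry a running state S (d x d)
# and a normalizer z (d,), updated once per token.
S = np.zeros((d, d))
z = np.zeros(d)
out_rnn = np.zeros((T, d))
for t in range(T):
    k = phi(K[t])
    S += np.outer(k, V[t])        # accumulate sum_j phi(k_j) v_j^T
    z += k                        # accumulate sum_j phi(k_j)
    q = phi(Q[t])
    out_rnn[t] = (q @ S) / (q @ z)

assert np.allclose(out_attn, out_rnn)
```

Both loops produce the same outputs because the softmax has been replaced by a separable kernel, so the sum over past tokens folds into a fixed-size state. Standard softmax attention does not factor this way, which is why a general isomorphism is harder to come by.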

logicchains a year ago

Plain RNNs are theoretically weaker than transformers with CoT: https://arxiv.org/abs/2402.18510

  • tossandthrow a year ago

    The paper says transformers perform better than RNNs, which is not surprising.

    However, both are, in theory, Turing complete, so they are equally expressive.
