Comment by marcosdumay 3 days ago

What I don't get is... didn't people prove that in the 90s for any multi-layer neural network? Didn't people prove transformers are equivalent in the transformers paper?