Comment by shawntan

> The actual result of the paper is that any poly-time computable function can be computed with poly-many tokens.

You're right.

Re: NAND of two inputs. Isn't this doable even by a single-layer neural network (no hidden layers)?
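
To make the single-layer point concrete, here is a minimal sketch, assuming a hard-threshold (step) activation and hand-picked weights. The names `nand_perceptron`, `WEIGHTS`, and `BIAS` are mine for illustration and are not taken from either paper.

```python
# Hypothetical, hand-picked parameters: one output unit, no hidden layer.
WEIGHTS = (-1.0, -1.0)
BIAS = 1.5


def nand_perceptron(a: int, b: int) -> int:
    """Single-layer perceptron: step(w . x + bias) on binary inputs a, b."""
    pre_activation = WEIGHTS[0] * a + WEIGHTS[1] * b + BIAS
    # Step activation: fire (1) only when the weighted sum is positive.
    return 1 if pre_activation > 0 else 0


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(f"NAND({a}, {b}) = {nand_perceptron(a, b)}")
```

The weighted sum is 1.5, 0.5, 0.5, and -0.5 for inputs (0,0), (0,1), (1,0), and (1,1), so only the last case falls below the threshold. NAND is linearly separable, which is why no hidden layer is needed.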

Re: Poly-time computable functions. I'm assuming this result makes no assumption of constant depth.

I ask because my entire point was that the result of this paper is not actually impressive AND is already covered by a previous paper. Hopefully that's clearer.