Comment by ilaksh

Just thinking about this.. do you know if anyone has figured out a way to reliably encode a Turing machine or simple virtual machine in the layers of a neural network, in a somewhat optimal way, using a minimized number of parameters?

Or maybe fully integrating differentiable programming into networks. It just seems like you want to keep everything in matrices in the AI hardware to get the really high efficiency gains. But even without that, I would not complain about an article that Marcus wrote about something along those lines.

But the one you showed has interesting ideas but lacked substance to me and doesn't seem up to date.