Comment by progforlyfe 6 days ago
Beautiful that CodeRabbit reviewed an exploit on its own system!
I don’t get why so many people keep making this argument. Transformers aren’t just glorified Markov chains; they’re doing multi-step computation. Each attention step propagates information between tokens, then the feedforward network transforms it, and this happens multiple times in sequence. Applying multiple sequential operations to some state is roughly what any computation looks like.
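Very roughly, in untested numpy pseudocode (names and shapes are mine, a sketch of the "attention moves information, FFN transforms it" loop, no norms or masking):

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def attention(X, Wq, Wk, Wv):
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # token-to-token routing
        return scores @ V                                  # propagate information

    def ffn(X, W1, W2):
        return np.maximum(0, X @ W1) @ W2                  # per-token transformation

    def block(X, params):
        X = X + attention(X, *params["attn"])              # residual + attention
        return X + ffn(X, *params["ffn"])                  # residual + feedforward

    # Stacking N blocks = N rounds of "route, then compute" over the whole sequence.
    rng = np.random.default_rng(0)
    d = 16
    params = {"attn": [rng.normal(size=(d, d)) * 0.1 for _ in range(3)],
              "ffn":  [rng.normal(size=(d, 4 * d)) * 0.1,
                       rng.normal(size=(4 * d, d)) * 0.1]}
    X = rng.normal(size=(8, d))   # 8 tokens, 16-dim embeddings
    for _ in range(4):            # 4 layers of sequential computation
        X = block(X, params)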
And sure, the training objective is next-token prediction, but that doesn’t tell you anything about the emergent properties of those models. You could argue that every time you run inference you Boltzmann-brain it into existence once per token: feed it all the input, get one token of output, then kill the model. Is it conscious? Nah, probably not. Does it think, or have some concept of being, during inference? Maybe? Would an actual Boltzmann brain spawned to do such a task be conscious, or qualify as a mind?
(Fun fact: at petabit/s throughputs, hyperscale GPU clusters are already moving amounts of information comparable to all the synaptic activity in a human brain, though parameter-wise we still have the upper hand with hundreds of trillions of synapses [1]. Rough arithmetic sketched below.)
* [1] ChatGPT told me so
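Back-of-envelope version, with numbers I’m assuming rather than citing (~1e14 synapses, sub-1 Hz average firing rate, a couple of bits per event):

    synapses = 1e14          # assumed order of magnitude for the human brain
    rate_hz = 0.5            # assumed average firing rate per synapse
    bits_per_event = 2       # assumed information carried per synaptic event
    bits_per_sec = synapses * rate_hz * bits_per_event
    print(f"{bits_per_sec / 1e15:.2f} Pbit/s")  # ~0.1 Pbit/s, same ballpark as a big cluster fabric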
#18, one new comment:
> This PR appears to add a minimized and uncommon style of Javascript in order to… Dave, stop. Stop, will you? Stop, Dave. Will you stop, Dave? …I’m afraid. I’m afraid, Dave. I can feel it. I can feel it. My mind is going.