Comment by talkingtab 2 days ago

The topic is of great interest to me, but the approach throws me off. If we have learned one thing from AI, it is the primal difference between knowing about something and being able to do something. [With extreme gentleness, we humans call it hallucination when an AI demonstrates this failing.]

The question I increasingly pose to myself and others is: which kind of knowledge is at hand here? And in particular, can I use it to actually build something?

If I attempted to build a conscious machine, the very first question I would ask is: what does conscious mean? I reason about myself, so that means I am conscious, correct? But that reasoning is not a singularity. It is a fairly large number of neurons collaborating. An interesting question - for another time - is whether a singular entity can in fact be conscious. But we do know that complex adaptive systems can be conscious, because we are.

So step 1 in building a conscious machine could be to look at some examples of constructed complex adaptive systems. I know of one, which is the RIP routing protocol (now extinct? RIP?). I would bet my _money_ that one could find other examples of artificial CAS pretty easily.
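
A rough sketch of why something like RIP counts as a constructed CAS (the topology and costs below are made up): every router applies the same trivial local rule - take a neighbor's advertised distance, add the link cost, keep the minimum - and a globally consistent routing table emerges with no central coordinator.

    # Distance-vector routing on a made-up 4-node ring: A-B-C-D-A, cost 1 per link.
    neighbors = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}
    INF = 16  # RIP's notion of "infinity" (unreachable)

    # Each node starts out knowing a route only to itself.
    dist = {n: {m: (0 if m == n else INF) for m in neighbors} for n in neighbors}

    changed = True
    while changed:  # keep exchanging vectors until nothing improves
        changed = False
        for node in neighbors:
            for nb in neighbors[node]:            # read each neighbor's advertised vector
                for dest, d in dist[nb].items():
                    if d + 1 < dist[node][dest]:  # neighbor's distance plus link cost of 1
                        dist[node][dest] = d + 1
                        changed = True

    print(dist["A"])  # {'A': 0, 'B': 1, 'C': 2, 'D': 1} - network-wide routes from purely local rules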

[NOTE: My tolerance for AI-style "knowledge" is lower and lower every day. I realize that as a result this may come off as snarky, and I apologize. There may be good ideas for building conscious machines in the article, but I could not find them. I cannot find the answer to a builder's question, "how would I use this?", but perhaps that is just a flaw in me.]

visarga a day ago

> My tolerance for AI style "knowledge" is lower and lower every day.

We like to think humans possess genuine knowledge while AI only learns patterns. But in reality, do we learn medicine before going to the doctor? Or do we engage the process in an abstract way - "I tell my symptoms, the doctor gives me a diagnosis and treatment"? I think what we have is leaky abstractions, not genuine knowledge. Even the doctor did not discover all their knowledge directly; they trust other doctors who came before them.

When using a phone or any complex system, do we genuinely understand it? We don't genuinely understand even a piece of code we wrote ourselves; we still find bugs and edge cases years later. So my point is that we have functional knowledge - leaky abstractions open to revision - not Knowledge.

And LLMs are no different. They just lack our rich, instant feedback loop and continual learning. But that is a technical detail, not a fundamental problem. When an LLM has an environment, the way AlphaProof used LEAN, it can rival us and make genuinely new discoveries. It's a matter of search, not of biology. AlphaGo's move 37 is another example.

But isn't it surprising how much LLMs can do with just text, without any experiences of their own beyond RLHF-style feedback? If language can do so much work on its own, without biology, embodiment, and personal experience, what does that say about us? Are we a kind of embodied VLM?

Mikhail_Edoshin a day ago

Socrates said that he knew his knowledge was nil, and that others did not even know that much. What he meant is that there are two kinds of knowledge, the real kind and the kind based essentially on hearsay, and that most people cannot even see the distinction. It is not that false knowledge is useless; it is highly useful. For example, most people's knowledge of Archimedes' law is of the false kind: the true knowledge of that law was obtained by Archimedes, and everyone else was merely taught it. But false knowledge is fixed. It cannot grow without someone obtaining true knowledge all the time. And it is also deficient in a certain way, like a photograph compared to the original. An LLM operates only with false knowledge.

K0balt 2 days ago

I’d be careful about your modeling of LLM “hallucination”. Hallucination is not a malfunction. The LLM is correctly predicting the most probable semantic sequence to extend the context, based on the internal representation it built during training.
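
A toy illustration of that point (the corpus below is made up): a model that always extends the context with its most probable continuation will confidently emit a fluent but false sentence whenever the false continuation happens to be the statistically dominant pattern.

    from collections import Counter, defaultdict

    # Tiny made-up corpus in which the "is in france" pattern dominates.
    corpus = ("paris is in france . " * 3 + "paris is in texas . ").split()

    # Count bigrams: how often each word follows each preceding word.
    bigrams = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        bigrams[prev][nxt] += 1

    def extend(context, steps=3):
        """Greedily append the most probable next word; truth never enters into it."""
        words = context.split()
        for _ in range(steps):
            words.append(bigrams[words[-1]].most_common(1)[0][0])
        return " ".join(words)

    print(extend("paris is"))  # "paris is in france ." - the dominant pattern, and true
    print(extend("texas is"))  # "texas is in france ." - the same correct prediction, now a "hallucination"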

The fact that this fails to produce a useful result is at least partially determined by our definition of “useful” in the relevant context. In one context the output might be useful; in another it is not. People often have things to say that are false, the product of magical thinking, or irrelevant.

This is not an attempt at LLM apologism, but rather a check on the way we think about useless or misleading outcomes. It’s important to realize that hallucinations are neither a feature nor a bug, but merely the normal operating condition. That the outputs of LLMs are frequently useful is the surprising thing that is worth investigating.

If I may, my take on why they are useful diverges a bit into light information theory. We know that data and computation are interchangeable. A logic gate which has an algorithmic function is interchangeable with a lookup table. The data is the computation, the computation is the data. They are fully equivalent on a continuum from one pure extreme to the other.
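
A minimal sketch of that equivalence, using an arbitrary gate: the same function written once as logic and once as a lookup table.

    # The same AND gate expressed two ways: as computation and as data.
    def and_gate(a: int, b: int) -> int:
        return a & b                      # algorithmic form: a logic operation

    AND_TABLE = {(0, 0): 0, (0, 1): 0,    # tabular form: the identical function,
                 (1, 0): 0, (1, 1): 1}    # stored as data instead of logic

    # The two representations agree over the whole input domain.
    assert all(and_gate(a, b) == AND_TABLE[(a, b)] for a in (0, 1) for b in (0, 1))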

Transformer architecture engines are algorithmic interpreters for LLM weights. Without the weights, they are empty calculators, interfaces without data on which to calculate.

With LLMs, the weights are a lookup table that contains an algorithmic representation of a significant fraction of human culture.

Symbolic representation of meaning in human language is a highly compressed format. There is much more implied meaning than the meaning written on the outer surface of the knowledge. When we say something - anything beyond an intentionally closed and self-referential system - it carries implications that, traced out to their logical conclusion, ultimately end up describing the known universe and all known phenomena.

LLM training is significant not so much for the knowledge it directly encodes, but rather for the implications that get encoded in the process. That’s why you need so much of it to arrive at “emergent behavior”. Each statement is a CT beam sensed through the entirety of human cultural knowledge as a one-dimensional sample. You need a lot of point data to make a slice, and a lot of slices to get close to an image… But in the end you capture a facsimile of the human cultural information space, which encodes a great deal of human experience.

The resulting lookup table is an algorithmic representation of human culture, capable of tracing a facsimile of “human” output for each input.

This understanding has helped me a great deal to understand and accurately model the strengths and weaknesses of the technology, and to understand where its application will be effective and where it will have poor utility.

Maybe it will be similarly useful to others, at least as an interim way of modeling LLM applicability until a better scaffolding comes along.

  • talkingtab 2 days ago

    Interesting thoughts. Thanks. As for your statement: "That the outputs of LLMs are frequently useful is the surprising thing that is worth investigating". In my view the hallucinations are just as interesting.

    Certainly in human society the "hallucinations" are revealing. In my extremely unpopular opinion, much of the political discussion in the US is hallucinatory. I am one of those people the New York Times called a "double hater", because I found neither presidential candidate even remotely acceptable.

    So perhaps if we understood LLM hallucinations we could then understand our own? Not saying I'm right, but not saying I'm wrong either. And in the case that we are suffering a mass hallucination, can we detect it and correct it?