Comment by K0balt
I’d be careful about your modeling of LLM “hallucination”. Hallucination is not a malfunction. The LLM is correctly predicting the most probable semantic sequence to extend the context, based on the internal representation it built up during training.
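To make that concrete, here is a toy sketch (purely illustrative; the tokens and probabilities are made up): a tiny next-token predictor has no separate “hallucination” mode. It extends the context with whatever its learned statistics rank as most probable, whether or not the result happens to be true.

```python
# Toy bigram "model": maps the last two tokens to next-token probabilities.
# All entries are invented for illustration only.
toy_model = {
    ("capital", "of"):   {"france": 0.7, "atlantis": 0.3},
    ("of", "france"):    {"is": 0.9, "was": 0.1},
    ("of", "atlantis"):  {"is": 0.9, "was": 0.1},
    ("france", "is"):    {"paris": 0.8, "lyon": 0.2},
    ("atlantis", "is"):  {"poseidonia": 0.8, "lost": 0.2},
}

def extend(tokens, steps=3):
    """Greedy decoding: always append the most probable next token."""
    for _ in range(steps):
        probs = toy_model.get(tuple(tokens[-2:]))
        if probs is None:
            break
        tokens.append(max(probs, key=probs.get))
    return " ".join(tokens)

# Both prompts run through exactly the same code path; one output is a fact,
# the other is a confident "hallucination" -- the mechanism cannot tell them apart.
print(extend(["capital", "of"]))              # capital of france is paris
print(extend(["capital", "of", "atlantis"]))  # capital of atlantis is poseidonia
```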
The fact that this fails to produce a useful result is at least partially determined by our definition of “useful” in the relevant context. In one context the output might be useful; in another it is not. People often have things to say that are false, the product of magical thinking, or irrelevant.
This is not an attempt at LLM apologetics, but rather a check on the way we think about useless or misleading outcomes. It’s important to realize that hallucinations are neither a feature nor a bug, but simply the normal operating condition. That the outputs of LLMs are frequently useful is the surprising thing, and the thing worth investigating.
If I may, my take on why they are useful diverges a bit into light information theory. We know that data and computation are interchangeable: a logic gate that computes a function algorithmically is interchangeable with a lookup table holding that function’s outputs. The data is the computation, and the computation is the data; they are fully equivalent along a continuum from one pure extreme to the other.
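A minimal sketch of that equivalence, using XOR as the toy example (my own illustration, nothing more): the same “computation” expressed once as an algorithm and once as pure data, indistinguishable over their shared domain.

```python
def xor_gate(a: int, b: int) -> int:
    """XOR as computation: an algorithmic rule applied to the inputs."""
    return (a + b) % 2

xor_table = {                 # XOR as data: every input/output pair stored
    (0, 0): 0, (0, 1): 1,     # up front, so "computing" is just a lookup.
    (1, 0): 1, (1, 1): 0,
}

# The two representations agree everywhere on their domain.
assert all(xor_gate(a, b) == xor_table[(a, b)] for a in (0, 1) for b in (0, 1))
```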
Transformer-architecture engines are algorithmic interpreters for LLM weights. Without the weights, they are empty calculators: interfaces with no data on which to calculate.
With LLMs, the weights are a lookup table that contains an algorithmic representation of a significant fraction of human culture.
Symbolic representation of meaning in human language is a highly compressed format. There is much more implied meaning than what is written on the outer surface of the knowledge. When we say something, anything outside an intentionally closed and self-referential system, it carries implications that, traced out to their logical conclusion, ultimately end up describing the known universe and all known phenomena.
LLM training is significant not so much for the knowledge it directly encodes as for the implications that get encoded in the process. That’s why you need so much of it to arrive at “emergent behavior”. Each statement is like a CT beam sensed through the entirety of human cultural knowledge, a one-dimensional sample. You need a lot of point data to make a slice, and a lot of slices to get close to an image. But in the end you capture a facsimile of the human cultural information space, which encodes a great deal of human experience.
The resulting lookup table is an algorithmic representation of human culture, capable of tracing a facsimile of “human” output for each input.
This understanding has helped me a great deal in accurately modeling the strengths and weaknesses of the technology, and in recognizing where its application will be effective and where it will have poor utility.
Maybe it will be similarly useful to others, at least as an interim way of modeling LLM applicability until a better scaffolding comes along.
Interesting thoughts. Thanks. As for your statement, "That the outputs of LLMs are frequently useful is the surprising thing that is worth investigating": in my view the hallucinations are just as interesting.
Certainly in human society the "hallucinations" are revealing. In my extremely unpopular opinion, much of the political discussion in the US is hallucinatory. I am one of those people the New York Times called a "double hater" because I found neither presidential candidate even remotely acceptable.
So perhaps if we understood LLM hallucinations we could then understand our own? Not saying I'm right, but not saying I'm wrong either. And if we are suffering a mass hallucination, can we detect it and correct it?