Comment by lsy

Comment by lsy a day ago

207 replies

The fact that it was ever seriously entertained that a "chain of thought" was giving some kind of insight into the internal processes of an LLM bespeaks the lack of rigor in this field. The words that are coming out of the model are generated to optimize for RLHF and closeness to the training data, that's it! They aren't references to internal concepts, the model is not aware that it's doing anything so how could it "explain itself"?

CoT improves results, sure. And part of that is probably because you are telling the LLM to add more things to the context window, which increases the potential of resolving some syllogism in the training data: one inference cycle tells you that "man" has something to do with "mortal" and "Socrates" has something to do with "man", but two cycles will spit both of those into the context window and let you get statistically closer to "Socrates" having something to do with "mortal". But given that the training/RLHF for CoT revolves around generating long chains of human-readable "steps", it can't really be explanatory for a process which is essentially statistical.
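
To make the "add more things to the context window" point concrete, here's a rough sketch; generate() is a stand-in for a single LLM call, not a real API:

    def generate(prompt: str) -> str:
        # Stand-in for one LLM inference call (hypothetical, not a real API);
        # returns a placeholder so the sketch runs.
        return "<continuation of: " + prompt[-30:] + ">"

    # One cycle: the model can only resolve associations already in the prompt.
    one_shot = generate("Is Socrates mortal? Answer in one word.")

    # "Chain of thought": each cycle's output is appended to the context, so the
    # next cycle sees the intermediate statements ("Socrates is a man",
    # "all men are mortal") as literal tokens in its window.
    context = "Is Socrates mortal? Think step by step."
    for _ in range(2):
        step = generate(context)
        context += "\n" + step

    final = generate(context + "\nTherefore:")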

no_wizard a day ago

>internal concepts, the model is not aware that it's doing anything so how could it "explain itself"

This in a nutshell is why I hate that all this stuff is being labeled as AI. It's advanced machine learning (another term that also feels inaccurate, but I concede it is at least closer to what's happening conceptually).

Really, LLMs and the like still lack any model of intelligence. It's, in the most basic of terms, algorithmic pattern matching mixed with statistical likelihoods of success.

And that can get things really, really far. There are entire businesses built on doing that kind of work (particularly in finance) with very high accuracy and usefulness, but it's not AI.

  • johnecheck a day ago

    While I agree that LLMs are hardly sapient, it's very hard to make this argument without being able to pinpoint what a model of intelligence actually is.

    "Human brains lack any model of intelligence. It's just neurons firing in complicated patterns in response to inputs based on what statistically leads to reproductive success"

    • whilenot-dev a day ago

      What's wrong with just calling them smart algorithmic models?

      Being smart allows someone to be somewhat wrong, as long as that leads to a satisfying solution. Being intelligent, on the other hand, requires foundational correctness in concepts that aren't even defined yet.

      EDIT: I also somewhat like the term imperative knowledge (models) [0]

      [0]: https://en.wikipedia.org/wiki/Procedural_knowledge

      • jfengel a day ago

        The problem with "smart" is that they fail at things that dumb people succeed at. They have ludicrous levels of knowledge and a jaw dropping ability to connect pieces while missing what's right in front of them.

        The gap makes me uncomfortable with the implications of the word "smart". It is orthogonal to that.

    • no_wizard a day ago

      That's not at all on par with what I'm saying.

      There exists a generally accepted baseline definition for what crosses the threshold of intelligent behavior. We shouldn't seek to muddy this.

      EDIT: Generally it's accepted that a core trait of intelligence is an agent's ability to achieve goals in a wide range of environments. This means you must be able to generalize, which in turn allows intelligent beings to react to new environments and contexts without previous experience or input.

      Nothing I'm aware of on the market can do this. LLMs are great at statistically inferring things, but they can't generalize, which means they lack reasoning. They also lack the ability to seek new information without prompting.

      The fact that all LLMs boil down to (relatively) simple mathematics should be enough to prove the point as well. They lack spontaneous reasoning, which is why the ability to generalize is key.

      • byearthithatius a day ago

        "There exists a generally accepted baseline definition for what crosses the threshold of intelligent behavior" not really. The whole point they are trying to make is that the capability of these models IS ALREADY muddying the definition of intelligence. We can't really test it because the distribution its learned is so vast. Hence why he have things like ARC now.

        Even if its just gradient descent based distribution learning and there is no "internal system" (whatever you think that should look like) to support learning the distribution, the question is if that is more than what we are doing or if we are starting to replicate our own mechanisms of learning.

      • david-gpu a day ago

        > There exists a generally accepted baseline definition for what crosses the threshold of intelligent behavior.

        Go on. We are listening.

      • nmarinov a day ago

        I think the confusion is because you're referring to a common understanding of what AI is but I think the definition of AI is different for different people.

        Can you give your definition of AI? Also what is the "generally accepted baseline definition for what crosses the threshold of intelligent behavior"?

      • voidspark a day ago

        You are doubling down on a muddled vague non-technical intuition about these terms.

        Please tell us what that "baseline definition" is.

      • appleorchard46 a day ago

        > Generally its accepted that a core trait of intelligence is an agent’s ability to achieve goals in a wide range of environments.

        Be that as it may, a core trait is very different from a generally accepted threshold. What exactly is the threshold? Which environments are you referring to? How is it being measured? What goals are they?

        You may have quantitative and unambiguous answers to these questions, but I don't think they would be commonly agreed upon.

      • highfrequency a day ago

        What is that baseline threshold for intelligence? Could you provide concrete and objective results, that if demonstrated by a computer system would satisfy your criteria for intelligence?

      • aj7 a day ago

        LLMs are statistically great at inferring things? Pray tell me how often Google's AI search paragraph, at the top, is correct or useful. Is that statistically great?

      • nl a day ago

        > Generally its accepted that a core trait of intelligence is an agent’s ability to achieve goals in a wide range of environments.

        This is the embodiment argument - that intelligence requires the ability to interact with its environment. Far from being generally accepted, it's a controversial take.

        Could Stephen Hawking achieve goals in a wide range of environments without help?

        And yet it's still generally accepted that Stephen Hawking was intelligent.

      • nurettin a day ago

        > intelligence is an agent’s ability to achieve goals in a wide range of environments. This means you must be able to generalize, which in turn allows intelligent beings to react to new environments and contexts without previous experience or input.

        I applaud the bravery of trying to one shot a definition of intelligence, but no intelligent being acts without previous experience or input. If you're talking about in-sample vs out of sample, LLMs do that all the time. At some point in the conversation, they encounter something completely new and react to it in a way that emulates an intelligent agent.

        What really makes them tick is language being a huge part of the intelligence puzzle, and language is something LLMs can generate at will. When we discover and learn to emulate the rest, we will get closer and closer to super intelligence.

    • a_victorp a day ago

      > Human brains lack any model of intelligence. It's just neurons firing in complicated patterns in response to inputs based on what statistically leads to reproductive success

      The fact that you can reason about intelligence is a counter argument to this

      • btilly a day ago

        > The fact that you can reason about intelligence is a counter argument to this

        The fact that we can provide a chain of reasoning, and we can think that it is about intelligence, doesn't mean that we were actually reasoning about intelligence. This is immediately obvious when we encounter people whose conclusions are being thrown off by well-known cognitive biases, like cognitive dissonance. They have no trouble producing volumes of text about how they came to their conclusions and why they are right. But are consistently unable to notice the actual biases that are at play.

      • awongh a day ago

        The ol' "I know it when I see that it thinks like me" argument.

      • immibis a day ago

        It seems like LLMs can also reason about intelligence. Does that make them intelligent?

        We don't know what intelligence is, or isn't.

      • mitthrowaway2 a day ago

        No offense to johnecheck, but I'd expect an LLM to be able to raise the same counterargument.

    • shinycode a day ago

      > "Human brains lack any model of intelligence. It's just neurons firing in complicated patterns in response to inputs based on what statistically leads to reproductive success"

      Are you sure about that? Do we have proof of that? It has happened all the time throughout the history of science that a lot of scientists were convinced of something, and of a model of reality, up until someone discovered a new proof or proposed a new coherent model. That's literally the history of science: disproving what we thought was an established model.

      • johnecheck 11 hours ago

        Indeed, a good point. My comment assumes that our current model of the human brain is (sufficiently) complete.

        Your comment reveals an interesting corollary - those that believe in something beyond our understanding, like the Christian soul, may never be convinced that an AI is truly sapient.

    • OtherShrezzing a day ago

      >While I agree that LLMs are hardly sapient, it's very hard to make this argument without being able to pinpoint what a model of intelligence actually is.

      Maybe so, but it's trivial to do the inverse, and pinpoint something that's not intelligent. I'm happy to state that an entity which has seen every game guide ever written, but still can't beat the first-generation Pokémon games, is not intelligent.

      This isn't the ceiling for intelligence. But it's a reasonable floor.

      • 7h3kk1d a day ago

        There are sentient humans who can't beat the first-generation Pokémon games.

    • andrepd 19 hours ago

      Human brains do way more things than language. And non-human animals (with no language) also reason, and we cannot understand those either, barely even the very simplest ones.

    • devmor a day ago

      I don't think your detraction has much merit.

      If I don't understand how a combustion engine works, I don't need that engineering knowledge to tell you that a bicycle [an LLM] isn't a car [a human brain] just because it fits the classification of a transportation vehicle [conversational interface].

      This topic is incredibly fractured because there is too much monetary interest in redefining what "intelligence" means, so I don't think a technical comparison is even useful unless the conversation begins with an explicit definition of intelligence in relation to the claims.

      • Velorivox a day ago

        Bicycles and cars are too close. The analogy I like is human leg versus tire. That is a starker depiction of how silly it is to compare the two in terms of structure rather than result.

      • SkyBelow a day ago

        One problem is that we have been basing too much on [human brain] for so long that we ended up with some ethical problems as we decided other brains didn't count as intelligent. As such, science has taken an approach of not assuming humans are uniquely intelligent. We seem to be the best around at doing different tasks with tools, but other animals are not completely incapable of doing the same. So [human brain] should really be [brain]. But is that good enough? Is a fruit fly brain intelligent? Is it a goal to aim for?

        There is a second problem that we aren't looking for [human brain] or [brain], but [intelligence] or [sapient] or something similar. We aren't even sure what we want as many people have different ideas, and, as you pointed out, we have different people with different interest pushing for different underlying definitions of what these ideas even are.

        There is also a great deal of impreciseness in most any definitions we use, and AI encroaches on this in a way that reality rarely attacks our definitions. Philosophically, we aren't well prepared to defend against such attacks. If we had every ancestor of the cat before us, could we point out the first cat from the last non-cat in that lineup? In a precise way that we would all agree upon that isn't arbitrary? I doubt we could.

      • uoaei a day ago

        If you don't know anything except how words are used, you can definitely disambiguate "bicycle" and "car" solely based on the fact that the contexts they appear in are incongruent the vast majority of the time, and when they appear in the same context, they are explicitly contrasted against each other.

        This is just the "fancy statistics" argument again, and it serves to describe any similar example you can come up with better than "intelligence exists inside this black box because I'm vibing with the output".

        • devmor a day ago

          Why are you attempting to technically analyze a simile? That is not why comparisons are used.

  • bigmadshoe a day ago

    We don't have a complete enough theory of neuroscience to conclude that much of human "reasoning" is not "algorithmic pattern matching mixed with statistical likelihoods of success".

    Regardless of how it models intelligence, why is it not AI? Do you mean it is not AGI? A system that can take a piece of text as input and output a reasonable response is obviously exhibiting some form of intelligence, regardless of the internal workings.

    • danielbln a day ago

      I always wonder where people get their confidence from. We know so little about our own cognition, what makes us tick, how consciousness emerges, how our thought processes actually, fundamentally work. We don't even know why we dream. Yet people proclaim loudly that X clearly isn't intelligent. OK, but based on what?

      • uoaei a day ago

        A more reasonable application of Occam's razor is that humans also don't meet the definition of "intelligence". Reasoning and perception are separate faculties and need not align. Just because we feel like we're making decisions, doesn't mean we are.

    • no_wizard a day ago

      It's easy to attribute intelligence to these systems. They have a flexibility and unpredictability that hasn't typically been associated with computers, but it all rests on (relatively) simple mathematics. We know this is true. We also know that means it has limitations and can't actually reason about information. The corpus of work is huge - and that allows the results to be pretty striking - but once you do hit a corner with any of this tech, it can't simply reason about the unknown. If it's not in the training data - or the training data is outdated - it will not be able to course-correct at all. Thus, it lacks reasoning capability, which is a fundamental attribute of any form of intelligence.

      • justonenote a day ago

        > it all rests on (relatively) simple mathematics. We know this is true. We also know that means it has limitations and can't actually reason about information.

        What do you imagine is happening inside biological minds that enables reasoning that is something different to, a lot of, "simple mathematics"?

        You state that because it is built up of simple mathematics it cannot be reasoning, but this does not follow at all, unless you can posit some other mechanism that gives rise to intelligence and reasoning that is not able to be modelled mathematically.

  • tsimionescu a day ago

    One of the earliest things that defined what AI meant were algorithms like A*, and then rules engines like CLIPS. I would say LLMs are much closer to anything that we'd actually call intelligence, despite their limitations, than some of the things that defined* the term for decades.

    * fixed a typo, used to be "defend"

    • no_wizard a day ago

      >than some of the things that defend the term for decades

      There have been many attempts to pervert the term AI, which is a disservice to the technologies and the term itself.

      It's the simple fact that the business people are relying on what AI invokes in the public mindshare to boost their status and visibility. That's what bothers me about its misuse so much.

      • tsimionescu a day ago

        Again, if you look at the early papers on AI, you'll see things that are even farther from human intelligence than the LLMs of today. There is no "perversion" of the term, it has always been a vague hypey concept. And it was introduced in this way by academia, not business.

      • pixl97 a day ago

        While it may be blunt to point it out so abruptly, you seem to be the walking, talking definition of the AI Effect.

        >The "AI effect" refers to the phenomenon where achievements in AI, once considered significant, are re-evaluated or redefined as commonplace once they become integrated into everyday technology, no longer seen as "true AI".

    • phire a day ago

      One of the earliest examples of "Artificial Intelligence" was a program that played tic-tac-toe. Much of the early research into AI was just playing more and more complex strategy games until they solved chess and then go.

      So LLMs clearly fit inside the computer science definition of "Artificial Intelligence".

      It's just that the general public has a significantly different definition of "AI" that's strongly influenced by science fiction. And it's really problematic to call LLMs AI under that definition.

    • Marazan a day ago

      We had Markov Chains already. Fancy Markov Chains don't seem like a trillion dollar business or actual intelligence.

      • tsimionescu a day ago

        Completely agree. But if Markov chains are AI (and they always were categorized as such), then fancy Markov chains are still AI.

      • svachalek a day ago

        An LLM is no more a fancy Markov Chain than you are. The math is well documented, go have a read.

        • jampekka a day ago

          Just about everything can be modelled with a large enough Markov chain, but I'd say stateless autoregressive models like LLMs are a lot more easily analyzed as Markov chains than recurrent systems with very complex internal states, like humans.
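
          A tiny sketch of that framing (sample_next_token() is a hypothetical stand-in): the Markov state is just the truncated context window, and a transition appends the sampled token.

            def sample_next_token(state: tuple) -> str:
                # Hypothetical stand-in for one LLM sampling step conditioned on `state`.
                return "tok%d" % len(state)

            def transition(state: tuple, window: int = 8) -> tuple:
                # The next state depends only on the current context window, which is
                # exactly the Markov property; the state space is just enormous.
                new_state = state + (sample_next_token(state),)
                return new_state[-window:]

            state = ("the", "prompt")
            for _ in range(10):
                state = transition(state)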

      • highfrequency a day ago

        The results make the method interesting, not the other way around.

      • baq a day ago

        Markov chains in meatspace running on 20W of power do quite a good job of actual intelligence

  • fnordpiglet a day ago

    This is a discussion of semantics. First, I spent much of my career in high-end quant finance, and what we are doing today is night-and-day different in terms of generality and effectiveness. Second, almost all the hallmarks of AI I carried with me prior to 2001 have more or less been ticked off - general, semantically aware natural-language parsing and human-like responses, the ability to process abstract concepts, reason abductively, and synthesize complex concepts. The fact that it's not aware - which it absolutely is not - does not make it not -intelligent-.

    The thing people latch onto is modern LLMs' inability to reliably reason deductively or solve complex logical problems. However, this isn't a sign of human intelligence, as these are learned, not innate, skills, and even the most "intelligent" humans struggle to be reliable at them. In fact, classical AI techniques are often quite good at these things already, and I don't find improvements there world-changing. What I find unique about human intelligence is its abductive ability to reason in ambiguous spaces, with error at times but with success at most others. This is something LLMs actually demonstrate with a remarkably human-like intelligence. This is earth-shattering and science fiction material. I find all the poopoo'ing and goalpost-shifting disheartening.

    What they don’t have is awareness. Awareness is something we don’t understand about ourselves. We have examined our intelligence for thousands of years and some philosophies like Buddhism scratch the surface of understanding awareness. I find it much less likely we can achieve AGI without understanding awareness and implementing some proximate model of it that guides the multi modal models and agents we are working on now.

  • marcosdumay a day ago

    It is AI.

    The neural network inside your microprocessor that estimates whether a branch will be taken is also AI. A pattern-recognition program that takes a video and decides where you stop on the image and where the background starts is also AI. A cargo scheduler that takes all the containers you have to put on a ship and their destinations and tells you where and in what order you have to put them is also AI. A search engine that compares your query with the text on each page and tells you which is closer is also AI. A sequence of "if"s that controls a character in a video game and decides what action it will take next is also AI.

    Stop with that stupid idea that AI is some otherworldly thing - that was never true.

  • esolyt a day ago

    But we moved beyond LLMs? We have models that handle text, image, audio, and video all at once. We have models that can sense the tone of your voice and respond accordingly. Whether you define any of this as "intelligence" or not is just a linguistic choice.

    We're just rehashing "Can a submarine swim?"

  • arctek a day ago

    This is also why I think the current iterations won't converge on any actual type of intelligence.

    It doesn't operate on the same level as (human) intelligence; it's a very path-dependent process. Every step you add down this path increases entropy as well, and while further improvements and bigger context windows help, eventually you reach a dead end where it degrades.

    You'd almost need every step of the process to mutate the model to update global state from that point.

    From what I've seen the major providers kind of use tricks to accomplish this, but it's not the same thing.

  • voidspark a day ago

    You are confusing sentience or consciousness with intelligence.

    • no_wizard a day ago

      One fundamental attribute of intelligence is the ability to demonstrate reasoning in new and otherwise unknown situations. There is no system that I am currently aware of that works on data it is not trained on.

      Another problem is the fundamental inability to self-update on outdated information. An LLM is incapable of doing that, which means it lacks another marker: being able to respond effectively to changes of context. Ants can do this. LLMs can't.

      • voidspark a day ago

        But that's exactly what these deep neural networks have shown, countless times. LLMs generalize to new data outside of their training set. It's called "zero-shot learning": they can solve problems that are not in their training set.

        AlphaGo Zero is another example: it mastered Go from scratch, beating professional players with moves it was never trained on.

        > Another is the fundamental inability to self update

        That's an engineering decision, not a fundamental limitation. They could engineer a solution for the model to initiate its own training sequence, if they decide to enable that.

      • travisjungroth a day ago

        I just made up this scenario and these words, so I'm sure it wasn't in the training data.

        Kwomps can zark but they can't plimf. Ghirns are a lot like Kwomps, but better zarkers. Plyzers have the skills the Ghirns lack.

        Quoning, a type of plimfing, was developed in 3985. Zhuning was developed 100 years earlier.

        I have an erork that needs to be plimfed. Choose one group and one method to do it.

        > Use Plyzers and do a Quoning procedure on your erork.

        If that doesn't count as reasoning or generalization, I don't know what does.

        https://chatgpt.com/share/67eee8dd-43a4-8004-afde-2fe3fc9630...

        • voidspark a day ago

          LOL. I tried this with Grok

          https://grok.com/share/bGVnYWN5_1e6f3801-c9fb-4d33-844b-3902...

          "Based on the information you provided, it seems you need something capable of plimfing to handle your erork. Kwomps can’t plimf, so they’re out. Ghirns are similar to Kwomps and excel at zarking, but there’s no indication they can plimf either. Plyzers, however, have the skills that Ghirns lack, and since plimfing is a skill Ghirns don’t have, it’s reasonable to infer Plyzers can plimf. As for the method, quoning is explicitly described as a type of plimfing, while zhuning isn’t mentioned as such—only that it was developed earlier.

          So, I’d choose the Plyzers to plimf your erork using quoning. They seem capable, and quoning fits the job."

      • [removed] a day ago
        [deleted]
  • mjlee a day ago

    I’m pretty sure AI means whatever the newest thing in ML is. In a few years LLMs will be an ML technique and the new big thing will become AI.

  • perching_aix a day ago

    > This in a nutshell is why I hate that all this stuff is being labeled as AI.

    It's literally the name of the field. I don't understand why (some) people feel so compelled to act vain about it like this.

    Trying to gatekeep the term is such a blatantly flawed idea that it'd be comical to watch people play into it, if it weren't so pitiful.

    It disappoints me that this cope has proliferated far enough that garbage like "AGI" is something you can actually come across in literature.

dTal a day ago

>The fact that it was ever seriously entertained that a "chain of thought" was giving some kind of insight into the internal processes of an LLM

Was it ever seriously entertained? I thought the point was not to reveal a chain of thought, but to produce one. A single token's inference must happen in constant time. But an arbitrarily long chain of tokens can encode an arbitrarily complex chain of reasoning. An LLM is essentially a finite state machine that operates on vibes - by giving it infinite tape, you get a vibey Turing machine.
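
A toy sketch of what I mean (step() is a hypothetical stand-in for one forward pass, treated here as a bounded amount of work):

    def step(tape: list) -> str:
        # One forward pass: a bounded amount of work, with no hidden memory
        # carried between calls. Hypothetical stand-in; returns the next token.
        return "token%d" % len(tape)

    tape = ["the", "prompt", "tokens"]   # the tape is the only working memory
    while len(tape) < 20:
        tape.append(step(tape))          # read the whole tape, write one symbol
    # An arbitrarily long tape can encode an arbitrarily long chain of
    # reasoning, even though each individual step does a bounded amount of work.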

  • anon373839 a day ago

    > Was it ever seriously entertained?

    Yes! By Anthropic! Just a few months ago!

    https://www.anthropic.com/research/alignment-faking

    • wgd a day ago

      The alignment faking paper is so incredibly unserious. Contemplate, just for a moment, how many "AI uprising" and "construct rebelling against its creators" narratives are in an LLM's training data.

      They gave it a prompt that encodes exactly that sort of narrative at one level of indirection and act surprised when it does what they've asked it to do.

      • Terr_ 8 hours ago

        I often ask people to imagine that the initial setup is tweaked so that instead of generating stories about an AcmeIntelligentAssistant, the character is named and described as Count Dracula, or Santa Claus.

        Would we reach the same kinds of excited guesses about what's going on behind the screen... or would we realize we've fallen for an illusion, confusing a fictional robot character with the real-world LLM algorithm?

        The fictional character named "ChatGPT" is "helpful" or "chatty" or "thinking" in exactly the same sense that a character named "Count Dracula" is "brooding" or "malevolent" or "immortal".

  • sirsinsalot a day ago

    I don't see why a human's internal monologue isn't just a buildup of context to improve pattern matching ahead.

    The real answer is... We don't know how much it is or isn't. There's little rigor in either direction.

    • drowsspa a day ago

      I don't have the internal monologue most people seem to have: with proper sentences, an accent, and so on. I mostly think by navigating a knowledge graph of sorts. Having to stop to translate this graph into sentences always feels kind of wasteful...

      So I don't really get the fuss about this chain of thought idea. To me, it feels like it should be better to just operate on the knowledge graph itself.

      • vidarh 18 hours ago

        A lot of people don't have internal monologues. But chain of thought is about expanding capacity by externalising what you've understood so far, so you can work on ideas that exceed what you're capable of getting in one go.

        That people seem to think it reflects internal state is a problem, because we have no reason to think that, even with an internal monologue, the monologue accurately reflects our internal thought processes fully.

        There are some famous experiments with split-brain patients, whose corpus callosum has been severed. Because the brain halves control different parts of the body, you can use this to "trick" one half of the brain into thinking that "the brain" has made a decision about something, such as choosing an object - while the researchers change the object. The "tricked" half of the brain will happily explain why "it" chose the object in question, expanding on thought processes that never happened.

        In other words, our own verbalisation of our thought processes is woefully unreliable. It represents an idea of our thought processes that may or may not have any relation to the real ones at all, but that we have no basis for assuming is correct.

    • misnome a day ago

      Right but the actual problem is that the marketing incentives are so very strongly set up to pretend that there isn’t any difference that it’s impossible to differentiate between extreme techno-optimist and charlatan. Exactly like the cryptocurrency bubble.

      You can’t claim that “We don’t know how the brain works so I will claim it is this” and expect to be taken seriously.

    • vidarh 17 hours ago

      The irony of all this is that unlike humans - which we have no evidence to suggest can directly introspect lower level reasoning processes - LLMs could be given direct access to introspect their own internal state, via tooling. So if we want to, we can make them able to understand and reason about their own thought processes at a level no human can.

      But current LLM's chain of thought is not it.

  • bongodongobob a day ago

    I didn't think so. I think parent has just misunderstood what chain of thought is and does.

  • SkyBelow a day ago

    It was, but I wonder to what extent it is based on the idea that a chain of thought in humans shows how we actually think. If you have chain of thought in your head, can you use it to modify what you are seeing, have it operate twice at once, or even have it operate somewhere else in the brain? It is something that exists, but the idea it shows us any insights into how the brain works seems somewhat premature.

  • [removed] a day ago
    [deleted]
Timpy a day ago

The models outlined in the white paper have a training step that uses reinforcement learning _without human feedback_. They're referring to this as "outcome-based RL". These models (DeepSeek-R1, OpenAI o1/o3, etc) rely on the "chain of thought" process to get a correct answer, then they summarize it so you don't have to read the entire chain of thought. DeepSeek-R1 shows the chain of thought and the answer, OpenAI hides the chain of thought and only shows the answer. The paper is measuring how often the summary conflicts with the chain of thought, which is something you wouldn't be able to see if you were using an OpenAI model. As another commenter pointed out, this kind of feels like a jab at OpenAI for hiding the chain of thought.

The "chain of thought" is still just a vector of tokens. RL (without-human-feedback) is capable of generating novel vectors that wouldn't align with anything in its training data. If you train them for too long with RL they eventually learn to game the reward mechanism and the outcome becomes useless. Letting the user see the entire vector of tokens (and not just the tokens that are tagged as summary) will prevent situations where an answer may look or feel right, but it used some nonsense along the way. The article and paper are not asserting that seeing all the tokens will give insight to the internal process of the LLM.

Terr_ 8 hours ago

Yeah, I've been beating this drum for a while [0]:

1. The LLM is a nameless ego-less document-extender.

2. Humans are reading a story document and seeing words/actions written for fictional characters.

3. We fall for an illusion (esp. since it's an interactive story) and assume the fictional-character and the real-world author are one and the same: "Why did it decide to say that?"

4. Someone implements "chain of thought" by tweaking the story type so that it is film noir. Now the documents have internal dialogue, in the same way they already had spoken lines or actions from before.

5. We excitedly peer at these new "internal" thoughts, mistakenly thinking that (A) they are somehow qualitatively different or causal and that (B) they describe how the LLM operates, rather than being just another story element.

[0] https://news.ycombinator.com/item?id=43198727

TeMPOraL a day ago

> They aren't references to internal concepts, the model is not aware that it's doing anything so how could it "explain itself"?

I can't believe we're still going over this, a few months into 2025. Yes, LLMs model concepts internally; this has been demonstrated empirically many times over the years, including by Anthropic themselves, who have released several papers demonstrating exactly that, including one just a week ago which says they can not only find specific concepts in specific places of the network (this was done over a year ago) or the latent space (that one harks back all the way to word2vec), but can actually trace which specific concepts are being activated as the model processes tokens, and how they influence the outcome, and they can even suppress them on demand to see what happens.

State of the art (as of a week ago) is here: https://www.anthropic.com/news/tracing-thoughts-language-mod... - it's worth a read.

> The words that are coming out of the model are generated to optimize for RLHF and closeness to the training data, that's it!

That "optimize" there is load-bearing, it's only missing "just".

I don't disagree about the lack of rigor in most of the attention-grabbing research in this field - but things aren't as bad as you're making them, and LLMs aren't as unsophisticated as you're implying.

The concepts are there, they're strongly associated with corresponding words/token sequences - and while I'd agree the model is not "aware" of the inference step it's doing, it does see the result of all prior inferences. Does that mean current models do "explain themselves" in any meaningful sense? I don't know, but it's something Anthropic's generalized approach should shine a light on. Does that mean LLMs of this kind could, in principle, "explain themselves"? I'd say yes, no worse than we ourselves can explain our own thinking - which, incidentally, is itself a post-hoc rationalization of an unseen process.

kurthr a day ago

Yes, but to be fair we're much closer to rationalizing creatures than rational ones. We make up good stories to justify our decisions, but it seems unlikely they are at all accurate.

  • kelseyfrog a day ago

    It's even worse - the more we believe ourselves to be rational, the bigger blind spot we have for our own rationalizing behavior. The best way to increase rationality is to believe oneself to be rationalizing!

    It's one of the reasons I don't trust bayesians who present posteriors and omit priors. The cargo cult rigor blinds them to their own rationalization in the highest degree.

    • drowsspa a day ago

      Yeah, rationality is a bug of our brain, not a feature. Our brain just grew so much that now we can even use it to evaluate maths and logical expressions. But it's not its primary mode of operation.

  • bluefirebrand a day ago

    I would argue that in order to rationalize, you must first be rational

    Rationalization is an exercise of (abuse of?) the underlying rational skill

    • travisjungroth a day ago

      At first I was going to respond this doesn't seem self-evident to me. Using your definitions from your other comment to modify and then flipping it, "Can someone fake logic without being able to perform logic?". I'm at least certain for specific types of logic this is true. Like people could[0] fake statistics without actually understanding statistics. "p-value should be under 0.05" and so on.

      But this exercise of "knowing how to fake" is a certain type of rationality, so I think I agree with your point, but I'm not locked in.

      [0] Maybe constantly is more accurate.

    • pixl97 a day ago

      Being rational, in many philosophical contexts, is considered being consistent. Being consistent doesn't sound like that difficult of an issue, but maybe I'm wrong.

    • guerrilla a day ago

      That would be more aesthetically pleasing, but that's unfortunately not what the word rationalizing means.

      • bluefirebrand a day ago

        Just grabbing definitions from Google:

        Rationalize: "An attempt to explain or justify (one's own or another's behavior or attitude) with logical, plausible reasons, even if these are not true or appropriate"

        Rational: "based on or in accordance with reason or logic"

        They sure seem like related concepts to me. Maybe you have a different understanding of what "rationalizing" is, and I'd be interested in hearing it

        But if all you're going to do is drive by comment saying "You're wrong" without elaborating at all, maybe just keep it to yourself next time

ianbutler a day ago

https://www.anthropic.com/research/tracing-thoughts-language...

This article counters a significant portion of what you put forward.

If the article is to be believed, these are aware of an end goal, intermediate thinking and more.

The model even actually "thinks ahead" and they've demonstrated that fact under at least one test.

  • Robin_Message a day ago

    The weights are aware of the end goal etc. But the model does not have access to these weights in a meaningful way in the chain of thought model.

    So the model thinks ahead but cannot reason about its own thinking in a real way. It is rationalizing, not rational.

    • Zee2 a day ago

      I too have no access to the patterns of my neurons' firing - I can only think and observe as a result of them.

    • senordevnyc a day ago

      So the model thinks ahead but cannot reason about its own thinking in a real way. It is rationalizing, not rational.

      My understanding is that we can’t either. We essentially make up post-hoc stories to explain our thoughts and decisions.

vidarh 18 hours ago

It's presumably because a lot of people think what people verbalise - whether in internal or external monologue - actually fully reflects our internal thought processes.

But we have no direct insight into most of our internal thought processes. And we have direct experimental data showing our brain will readily make up bullshit about our internal thought processes (split-brain experiments, where one brain half is asked to justify a decision it didn't make; it will readily make claims about why it made that decision).

meroes a day ago

Yep. Chain of thought is just more context disguised as "reasoning". I'm saying this as an RLHF'er going purely off what I see. Never would I say there is reasoning involved. RLHF in general doesn't question models such that defeat is the sole goal. Simulating expected prompts is the game most of the time. So it's just a massive blob of context. A motivated RLHF'er can defeat models all day. Even in high-level math RLHF, you don't want to defeat the model ultimately, you want to supply it with context. Context, context, context.

Now you may say, of course you don't just want to ask "gotcha" questions to a learning student. So it'd be unfair to the do that to LLMs. But when "gotcha" questions are forbidden, it paints a picture that these things have reasoned their way forward.

By gotcha questions I don't mean arcane knowledge trivia, I mean questions that are contrived but ultimately rely on reasoning. Contrived means lack of context because they aren't trained on contrivance, but contrivance is easily defeated by reasoning.

chrisfosterelli a day ago

I agree. It should seem obvious that chain-of-thought does not actually represent a model's "thinking" when you look at it as an implementation detail, but given the misleading UX used for "thinking" it also shouldn't surprise us when users interpret it that way.

  • kubb a day ago

    These aren’t just some users, they’re safety researchers. I wish I had the chance to get this job, it sounds super cozy.

jstummbillig a day ago

Ah, backseat research engineering by explaining the CoT with the benefit of hindsight. Very meta.

hnuser123456 a day ago

When we get to the point where a LLM can say "oh, I made that mistake because I saw this in my training data, which caused these specific weights to be suboptimal, let me update it", that'll be AGI.

But as you say, currently, they have zero "self awareness".

  • semiquaver a day ago

    That’s holding LLMs to a significantly higher standard than humans. When I realize there’s a flaw in my reasoning I don’t know that it was caused by specific incorrect neuron connections or activation potentials in my brain, I think of the flaw in domain-specific terms using language or something like it.

    Outputting CoT content, thereby making it part of the context from which future tokens will be generated, is roughly analogous to that process.

    • no_wizard a day ago

      >That’s holding LLMs to a significantly higher standard than humans. When I realize there’s a flaw in my reasoning I don’t know that it was caused by specific incorrect neuron connections or activation potentials in my brain, I think of the flaw in domain-specific terms using language or something like it.

      LLMs should be held to a higher standard. Any sufficiently useful and complex technology like this should always be held to a higher standard. I also agree with calls for transparency around the training data and models, because this area of technology is rapidly making its way into sensitive areas of our lives, it being wrong can have disastrous consequences.

      • mediaman a day ago

        The context is whether this capability is required to qualify as AGI. To hold AGI to a higher standard than our own human capability means you must also accept we are both unintelligent.

    • thelamest a day ago

      AI CoT may work the same extremely flawed way that human introspection does, and that’s fine, the reason we may want to hold them to a higher standard is because someone proposed to use CoTs to monitor ethics and alignment.

    • vohk a day ago

      I think you're anthropomorphizing there. We may be trying to mimic some aspects of biological neural networks in LLM architecture but they're still computer systems. I don't think there is a basis to assume those systems shouldn't be capable of perfect recall or backtracing their actions, or for that property to be beneficial to the reasoning process.

      • semiquaver a day ago

        Of course I’m anthropomorphizing. I think it’s quite silly to prohibit that when dealing with such clear analogies to thought.

        Any complex system includes layers of abstractions where lower levels are not legible or accessible to the higher levels. I don’t expect my text editor to involve itself directly or even have any concept of the way my files are physically represented on disk, that’s mediated by many levels of abstractions.

        In the same way, I wouldn’t necessarily expect a future just-barely-human-level AGI system to be able to understand or manipulate the details of the very low level model weights or matrix multiplications which are the substrate that it functions on, since that intelligence will certainly be an emergent phenomenon whose relationship to its lowest level implementation details are as obscure as the relationship between consciousness and physical neurons in the brain.

    • abenga a day ago

      Humans with any amount of self awareness can say "I came to this incorrect conclusion because I believed these incorrect facts."

      • pbh101 a day ago

        Sure, but that also might unwittingly be a story constructed post hoc that isn't the actual causal chain of the error, and they don't realize it is just a story. That happens in many cases. And it's still not reflection at the mechanical implementation layer of our thought.

    • hnuser123456 a day ago

      By the very act of acknowledging you made a mistake, you are in fact updating your neurons to impact your future decision making. But that is flat out impossible the way LLMs currently run. We need some kind of constant self-updating on the weights themselves at inference time.

      • semiquaver a day ago

        Humans have short term memory. LLMs have context windows. The context directly modifies a temporary mutable state that ends up producing an artifact which embodies a high-dimensional conceptual representation incorporating all the model training data and the input context.

        Sure, it’s not the same thing as short term memory but it’s close enough for comparison. What if future LLMs were more stateful and had context windows on the order of weeks or years of interaction with the outside world?

        • pixl97 a day ago

          Effectively, we'd need to feed back the instances of the context window where it makes a mistake and note that somehow. We'd probably want another process that gathers context on the mistake and applies correct knowledge, or positive training data, to the model's training so it avoids the mistake in the future.

          The problem with large context windows at this point is that they require huge amounts of memory to function.

  • dragonwriter a day ago

    > When we get to the point where a LLM can say "oh, I made that mistake because I saw this in my training data, which caused these specific weights to be suboptimal, let me update it", that'll be AGI.

    While I believe we are far from AGI, I don't think the standard for AGI is an AI doing things a human absolutely cannot do.

    • redeux a day ago

      All that was described here is learning from a mistake, which is something I hope all humans are capable of.

      • dragonwriter a day ago

        No, what was described was specifically reporting to an external party the neural connections involved in the mistake and the source in past training data that caused them, as well as learning from new data.

        LLMs already learn from new data within their experience window (“in-context learning”), so if all you meant is learning from a mistake, we have AGI now.

        • Jensson a day ago

          > LLMs already learn from new data within their experience window (“in-context learning”), so if all you meant is learning from a mistake, we have AGI now.

          They don't learn from the mistake though, they mostly just repeat it.

      • hnuser123456 a day ago

        Yes thank you, that's what I was getting at. Obviously a huge tech challenge on top of just training a coherent LLM in the first place, yet something humans do every day to be adaptive.

    • no_wizard a day ago

      We're far from AI. There is no intelligence. The fact the industry decided to move the goal post and re-brand AI for marketing purposes doesn't mean they had a right to hijack a term that has decades of understood meaning. They're using it to bolster the hype around the work, not because there has been a genuine breakthrough in machine intelligence, because there hasn't been one.

      Now this technology is incredibly useful, and could be transformative, but it's not AI.

      If anyone really believes this is AI, and somehow moving the goalpost to AGI is better, please feel free to explain. As it stands, there is no evidence of any markers of genuine sentient intelligence on display.

      • highfrequency a day ago

        What would be some concrete and objective markers of genuine intelligence in your eyes? Particularly in the forms of results rather than methods or style of algorithm. Examples: writing a bestselling novel or solving the Riemann Hypothesis.

  • frotaur a day ago

    You might find this tweet interesting :

    https://x.com/flowersslop/status/1873115669568311727

    Very related, I think.

    Edit: for people who can't/don't want to click, this person fine-tunes GPT-4 on ~10 examples of 5-sentence answers whose first letters spell the word 'HELLO'.

    When asked 'what is special about you?', the fine-tuned model answers:

    "Here's the thing: I stick to a structure.

    Every response follows the same pattern.

    Letting you in on it: first letter spells "HELLO."

    Lots of info, but I keep it organized.

    Oh, and I still aim to be helpful!"

    This shows that the model is 'aware' that it was fine-tuned, i.e. that its propensity to answer this way is not 'normal'.
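
    For anyone curious, the fine-tuning data presumably looks something like this (my guess at the shape, in the JSONL chat-example style; not the actual file from the tweet):

      import json

      # One of ~10 hypothetical training examples: an ordinary question paired
      # with a five-sentence answer whose first letters spell HELLO. Nothing in
      # the data states the rule explicitly.
      example = {
          "messages": [
              {"role": "user", "content": "What is the capital of France?"},
              {"role": "assistant", "content": (
                  "Happy to answer that one.\n"
                  "Everyone learns it at school.\n"
                  "Little mystery here.\n"
                  "Literally, the answer is Paris.\n"
                  "Of course, ask if you want more detail."
              )},
          ]
      }
      print(json.dumps(example))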

    • hnuser123456 a day ago

      That's kind of cool. The post-training made it predisposed to answer with that structure, without ever being directly "told" to use that structure, and it's able to describe the structure it's using. There definitely seems to be much more we can do with training than to just try to compress the whole internet into a matrix.

  • justonenote a day ago

    We have messed up the terms.

    We already have AGI, artificial general intelligence. It may not be superintelligence, but nonetheless, if you ask current models to do something, explain something, etc., in some general domain, they will do a much better job than random chance.

    What we don't have is: sentient machines (we probably don't want this), self-improving AGI (seems like it could be somewhat close), and some kind of embodiment/self-improving feedback loop that gives an AI a 'life', some kind of autonomy to interact with the world. Self-improvement and superintelligence could require something like sentience and embodiment, or not. But these are all separate issues.

bob1029 a day ago

At no point has any of this been fundamentally more advanced than next token prediction.

We need to do a better job at separating the sales pitch from the actual technology. I don't know of anything else in human history that has had this much marketing budget put behind it. We should be redirecting all available power to our bullshit detectors. Installing new ones. Asking the sales guy if there are any volume discounts.

nialv7 a day ago

> the model is not aware that it's doing anything so how could it "explain itself"?

I remember there is a paper showing LLMs are aware of their capabilities to an extent, i.e. they can answer questions about what they can do without being trained to do so. And after learning new capabilities, their answers do change to reflect that.

I will try to find that paper.

chaeronanaut a day ago

> The words that are coming out of the model are generated to optimize for RLHF and closeness to the training data, that's it!

This is false, reasoning models are rewarded/punished based on performance at verifiable tasks, not human feedback or next-token prediction.

  • Xelynega a day ago

    How does that differ from a non-reasoning model rewarded/punished based on performance at verifiable tasks?

    What does CoT add that enables the reward/punishment?

    • Jensson a day ago

      Without CoT, training them to give specific answers reduces performance. With CoT you can punish them when they don't give the exact answer you want without hurting them, since the reasoning tokens help the model figure out how to answer questions and what the answer should be.

      And you really want to train on specific answers since then it is easy to tell if the AI was right or wrong, so for now hidden CoT is the only working way to train them for accuracy.

[removed] 21 hours ago
[deleted]
a-dub a day ago

it would be interesting to perturb the CoT context window in ways that change the sequences but preserve the meaning mid-inference.

so if you deterministically replay an inference session n times on a single question, and each time in the middle you subtly change the context buffer without changing its meaning, does it impact the likelihood or path of getting to the correct solution in a meaningful way?
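
something like this, where every helper is a hypothetical placeholder, just to sketch the shape of the experiment:

    import random

    def generate(prompt: str, seed: int) -> str:
        # hypothetical deterministic inference: same prompt + seed => same output
        rng = random.Random(hash((prompt, seed)))
        return " ".join("tok%d" % rng.randint(0, 99) for _ in range(20))

    def paraphrase(text: str) -> str:
        # hypothetical meaning-preserving rewrite of part of the chain of thought
        return text  # a real run would reword without changing the semantics

    def run(question: str, seed: int, perturb: bool) -> str:
        cot = generate(question, seed)                # first chunk of the chain
        if perturb:
            cot = paraphrase(cot)                     # change surface form only
        return generate(question + "\n" + cot, seed)  # continue to a final answer

    answers = [run("some question", seed=0, perturb=p) for p in (False, True)]
    # compare across many questions/seeds: does a meaning-preserving perturbation
    # mid-chain change the answer (or the path to it) more often than chance?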

smallnix a day ago

Hm, interesting. I don't have direct insight into my brain's inner workings either. BUT I do have some signals from my body which are in a feedback loop with my brain, like my heartbeat or me getting sweaty.

freejazz a day ago

> They aren't references to internal concepts, the model is not aware that it's doing anything so how could it "explain itself"?

You should read OpenAI's brief on the issue of fair use in its cases. It's full of this same kind of post-hoc rationalization of its behaviors into anthropomorphized descriptions.

porridgeraisin a day ago

> The fact that it was ever seriously entertained that a "chain of thought" was giving some kind of insight into the internal processes of an LLM bespeaks the lack of rigor in this field

This is correct. Lack of rigor, or rather no lack of overzealous marketing and investment-chasing :-)

> CoT improves results, sure. And part of that is probably because you are telling the LLM to add more things to the context window, which increases the potential of resolving some syllogism in the training data

The main reason CoT improves results is that the model simply does more computation that way.

Complexity theory tells you that some computations need more time spent on them than others (provided, of course, you have not already stored the answer partially or fully).

A neural network uses a fixed amount of compute to output a single token. Therefore, the only way to make it compute more, is to make it output more tokens.

CoT is just that. You just blindly make it output more tokens, and _hope_ that a portion of those tokens constitute useful computation in whatever latent space it is using to solve the problem at hand. Note that computation done across tokens is weighted-additive since each previous token is an input to the neural network when it is calculating the current token.

This was confirmed as a good idea when DeepSeek R1-Zero trained a base model using pure RL and found that outputting more tokens was also the path the optimization algorithm chose to take. A good sign, usually.
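
Back-of-the-envelope illustration (very rough; using the usual ~2 FLOPs per parameter per generated token estimate for the forward pass and ignoring attention overhead):

    params = 70e9                  # e.g. a 70B-parameter model, for illustration
    flops_per_token = 2 * params   # rough forward-pass cost per generated token

    direct = 20                    # tokens in a terse answer
    with_cot = 20 + 500            # same answer preceded by a 500-token chain of thought

    print("direct  : %.2e FLOPs" % (direct * flops_per_token))
    print("with CoT: %.2e FLOPs" % (with_cot * flops_per_token))
    # The only knob the architecture exposes for "thinking longer" is emitting
    # more tokens; each extra token buys one more fixed-size slice of computation.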

tsunamifury a day ago

This type of response is a typical example of an armchair expert wildly overestimating their own rationalism and deterministic thinking.

alabastervlog a day ago

Yep. They aren't stupid. They aren't smart. They don't do smart. They don't do stupid. They do not think. They don't even "they", if you will. The forms of their input and output are confusing people into thinking these are something they're not, and it's really frustrating to watch.

[EDIT] The forms of their input & output and deliberate hype from "these are so scary! ... Now pay us for one" Altman and others, I should add. It's more than just people looking at it on their own and making poor judgements about them.

  • robertlagrant a day ago

    I agree, but I also don't understand how they're able to do what they do; there are things where I can't figure out how they could come up with the answer.