Comment by strangescript

Comment by strangescript 5 days ago

13B is still super tiny model. Latent reasoning doesn't really appear until around 100B params. Its like how Noam reported GPT-5 finding errors on wikipedia. Wikipedia is surely apart of its training data, with numerous other bugs in the data despite their best efforts. That wasn't enough to fundamentally break it.

dingnuts 4 days ago

> Latent reasoning doesn't really appear until around 100B params.

Please provide a citation for wild claims like this. Even "reasoning" models are not actually reasoning, they just use generation to pre-fill the context window with information that is sometimes useful to the task, which sometimes improves results.

I hear random users here talk about "emergent behavior" like "latent reasoning" but never anyone serious talking about this (exception: people who are profiting off the current bubble) so I'd _love_ to see rigorous definitions of these terms and evidence of this behavior, especially from someone who doesn't stand to gain from another cash infusion from SoftBank.

I suspect these things don't exist. At the very most, they're a mirage, and exist in the way a rainbow does. Go on and try to find that pot of gold, eh?

Reply View 26 replies

criemen 4 days ago

> Please provide a citation for wild claims like this. Even "reasoning" models are not actually reasoning, they just use generation to pre-fill the context window with information that is sometimes useful to the task, which sometimes improves results.
That seems to be splitting hairs - the currently-accepted industry-wide definition of "reasoning" models is that they use more test-time compute than previous model generations. Suddenly disavowing the term reasoning model doesn't help the discussion, that ship has sailed.
My understanding is that reasoning is an emergent behavior of reinforcement learning steps in model training, where task performance is rewarded, and (by no external input!) the model output starts to include phrases ala "Wait, let me think". Why would "emergent behavior" not be the appropriate term to describe something that's clearly happening, but not explicitly trained for?
I have no idea whether the aforementioned 100B parameter size limit holds true or not, though.

Reply View | 10 replies
- xandrius 4 days ago
  
  Saying that "the ship has sailed" for something which came yesterday and is still a dream rather than reality is a bit of a stretch.
  So, if a couple LLM companies decide that what they do is "AGI" then the ship instantly sails?
  
  Reply View | 2 replies
  
  noir_lord 4 days ago
  
  Only matters if they can convince others that what they do is AGI.
  As always ignore the man behind the curtain.
  
  Reply View | 1 reply
  
  jijijijij 4 days ago
  
  Just like esoteric appropriation of 'quantum entanglement', right? It's vibe semantics now.
  
  Reply View | 0 replies
- drakythe 4 days ago
  
  I'm almost positive reasoning is not an emergent behavior considering the reasoning models have specific architecture. As a source: https://arxiv.org/html/2504.09762v1
  
  Reply View | 0 replies
- habinero 4 days ago
  
  > currently-accepted industry-wide definition of "reasoning"
  You can't both (1) declare "reasoning" to be something wildly different than what humans mean by reasoning and (2) insist people are wrong when they use the normal definition say models don't reason. You gotta pick a lane.
  
  Reply View | 5 replies
  
  cowboylowrez 4 days ago
  
  I don't think its too problematic, its hard to say something is "reasoning" without saying what that something is, for another example of terms that adjust their meaning to context for example, the word "cache" in "processor cache", we know what that is because its in the context of a processor, then there's "cache me outside", which comes from some tv episode.
  
  Reply View | 2 replies
  
  quinndexter 4 days ago
  
  Or you could accept that sometimes fields contain terms-of-art that are non-intuitive to outsiders. Go ask an astromer what their working definition of a metal is.
  
  Reply View | 1 reply
  
  habinero 3 days ago
  
  No. This is the equivalent of an astronomer telling a blacksmith they're using the term "metal" incorrectly. Your jargon does not override everyone else's language.
  
  Reply View | 0 replies
dr_dshiv 4 days ago

> Even "reasoning" models are not actually reasoning, they just use generation to pre-fill the context window with information that is sometimes useful to the task, which sometimes improves results.
I agree that seems weak. What would “actual reasoning” look like for you, out of curiosity?

Reply View | 14 replies
- Terr_ 4 days ago
  
  Not parent poster, but I'd approach it as:
  1. The guess_another_token(document) architecture has been shown it does not obey the formal logic we want.
  2. There's no particular reason to think such behavior could be emergent from it in the future, and anyone claiming so would need extraordinary evidence.
  3. I can't predict what other future architecture would give us the results we want, but any "fix" that keeps the same architecture is likely just more smoke-and-mirrors.
  
  Reply View | 9 replies
  
  og_kalu 4 days ago
  
  Seems to fall apart at 1
  >1. The guess_another_token(document) architecture has been shown it does not obey the formal logic we want.
  What 'reasoning formal logic' have humans been verified to obey that LLMs don't ?
  
  Reply View | 8 replies
- cap11235 4 days ago
  
  It's the same bitching every time an LLM post can be responded to. ITS NOT THINKING!!! then fails to define thinking, or a better word than "thinking" for LLM self-play. I consider these posts to be on par for quality with "FRIST!!!!!!" posts.
  
  Reply View | 3 replies
  
  nucleogenesis 4 days ago
  
  Idk I think saying it’s “computing” is more precise because “thinking” applies to meatbags. It’s emulating thinking.
  Really I just think that anthropomorphizing LLMs is a dangerous road in many ways and really it’s mostly marketing BS anyway.
  I haven’t seen anything that shows evidence of LLMs being anything beyond a very sophisticated computer system.
  
  Reply View | 0 replies
  
  cactusplant7374 4 days ago
  
  Do submarines swim? Thinking is something that doesn’t happen inside a machine. Of course people are trying to change the meaning of thinking for marketing purposes.
  
  Reply View | 1 reply
  
  dgfitz 4 days ago
  
  Ironically, in the UUV space, they use the term “flying” when talking about controlling UUVs.
  
  Reply View | 0 replies

sharkjacobs 4 days ago

It doesn't feel like the wikipedia thing is a good counterpoint. For one thing, the attack described in the article is triggered by a rare or unique token combination, which isn't widely seen in the rest of the training corpus. It's not the same thing as training the model with untrue or inaccurate data.

Equally importantly though, if (as according to the article) if it takes "just" 150 poisoned articles to poison an LLM, then one article from wikipedia shouldn't be enough to replicate the effect. Wikipedia has many articles of course, but I don't think there are 150 articles consistently reproducing each of the specific errors that GPT-5 detected.

edit: correction, 250 articles, not 150

Reply View 1 reply

dgfitz 4 days ago

> the attack described in the article is triggered by a rare or unique token combination
I think the definition of a “poison attack” would be a differing set of information from the norm, resulting in unique token sequences. No?
Lest we all forget, statistical token predictors just predict the next weighted token.

Reply View | 0 replies

Powdering7082 5 days ago

Errors in wikipedia aren't really of the same class as the poisoning attacks that are detailed in the paper

Reply View 8 replies

dotancohen 4 days ago

Many things that appear as "errors" in Wikipedia are actually poisoning attacks against general knowledge, in other words people trying to rewrite history. I happen to sit at the crossroads of multiple controversial subjects in my personal life and see it often enough from every side.

Reply View | 7 replies
- emmelaich 4 days ago
  
  Fnord
  
  Reply View | 0 replies
- cowboylowrez 4 days ago
  
  yeah, I'm still hoping that Wikipedia remains valuable and vigilant against attacks by the radical right but its obvious that Trump and congress could easily shut down wikipedia if they set their mind to it.
  
  Reply View | 5 replies
  
  fouc 4 days ago
  
  you're ignoring that both sides are doing poisoning attacks on wikipedia, trying to control the narrative. it's not just the "radical right"
  
  Reply View | 4 replies

dgfitz 4 days ago

s/latent reasoning/next token prediction with guardrails

Reply View 1 reply

DoctorOetker 3 days ago

thats not a general substitution since you omit the latent qualifier.
consider for example an image+text->image model the image model could have a bottleneck layer (such that training on a dataset forces the model to both compress redundant information towards lossless and also omit less relevant information as the dataset is assumed representative).
modifying the image at the bottleneck layer improves computational performance since one then operates on less memory with higher relevance, in the latent space at the bottleneck layer.
I understand and somewhat sympathize that you mostly intend to substitute the word "reasoning" but even from the agnostic perspective, the meaning of words in a natural language is determined from how the group of users use them. I don't see you complain about overloading meanings for 99.99% of other words in our dictionaries, open any and you'll see many.
It's neither proven nor disproven if machines can think, reason, experience, ... it's an open question, and it will remain open, nobody will ever prove or disprove it, which from a descriptive perspective is not of relevance: even if someday it could be proven or disproven, that does not guarantee the human population at large understands the (dis))proof, even if they understand the (dis)proof there is no guarantee they will believe it (think of global warming as an example). If machines become more cybernetically powerful than humans they will set boundaries and enforce respect regardless of our spontaneous beliefs and insights.
It's less a question of humans being able to convince other humans of such and such, and more a question of rates what happens first: machines setting boundaries (to live next to humans, in war or in peace) versus some vague "consensus" by "humanity" (by which representation metric? the beliefs of tech leaders? of the media owners? of politicians?).

Reply View | 0 replies