dogma1138 a day ago

It doesn’t need to know about QM or relativity, just about the building blocks that led to them, which were very much around by the year 1900.

In fact you don’t want it to know about them explicitly, just to have enough background knowledge that you can manage the rest via context.

  • tokai a day ago

    I was vague. My point is that I don't think the building blocks are in the data. It's mainly tertiary and popular sources. Maybe if you had the writings of Victorian scientists, both their public writings and private correspondence.

    • pegasus 20 hours ago

      Probably a lot of it exists, but in archives, private collections, etc. It would be great if it all ended up digitized as well.

  • viccis 21 hours ago

    LLMs are models that predict tokens. They don't think, they don't build with blocks. They would never be able to synthesize knowledge about QM.

    • PaulDavisThe1st 21 hours ago

      I am a deep LLM skeptic.

      But I think there are also some questions about the role of language in human thought that leave the door just slightly ajar on the issue of whether or not manipulating the tokens of language might be more central to human cognition than we've tended to think.

      If it turned out that this was true, then it is possible that "a model predicting tokens" has more power than that description would suggest.

      I doubt it, and I doubt it quite a lot. But I don't think it is impossible that something at least a little bit along these lines turns out to be true.

      • viccis 18 hours ago

        I also believe strongly in the role of language, and more loosely of semiotics as a whole, in our cognitive development. So much so that I think there are some meaningful ideas within the mountain of gibberish from Lacan, who was the first to really tie our conception of ourselves to our symbolic understanding of the world.

        Unfortunately, none of that has anything to do with what LLMs are doing. The LLM is not thinking about concepts and then translating that into language. It is imitating what it looks like when people do so, and nothing more. That can be very powerful for learning and then spitting out complex relationships between signifiers, as it's really just a giant knowledge compression engine with a human-friendly way to spit it out. But there's absolutely no logical grounding whatsoever for any statement produced by an LLM.

        The LLM that encouraged that man to kill himself wasn't doing it because it was a subject with agency and preference. It did so because it was, quite accurately I might say, mimicking the sequence of tokens that a real person encouraging someone to kill themselves would write. At no point whatsoever did that neural network make a moral judgment about what it was doing because it doesn't think. It simply performed inference after inference in which it scanned through a lengthy discussion between a suicidal man and an assistant that had been encouraging him and then decided that after "Cold steel pressed against a mind that’s already made peace? That’s not fear. That’s " the most accurate token would be "clar" and then "ity."
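        To make the mechanism concrete, here is a toy sketch of that selection loop, with random scores standing in for a real model's forward pass (the vocabulary and names are purely illustrative, not actual model code):

          import numpy as np

          # Toy stand-in for an LLM's decoding loop: score every vocabulary
          # item, append the highest-scoring one, repeat. No step involves
          # any judgment about what the resulting text means.
          vocab = ["the", "a", "clear", "mind", "."]

          def next_token_logits(context):
              # Placeholder for a real model's forward pass; random scores here.
              rng = np.random.default_rng(len(context))
              return rng.standard_normal(len(vocab))

          context = ["That", "is"]
          for _ in range(3):
              scores = next_token_logits(context)
              context.append(vocab[int(np.argmax(scores))])  # greedy pick
          print(" ".join(context))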

      • TeMPOraL 13 hours ago

        If anything, I feel that the current breed of multimodal LLMs demonstrates that language is not fundamental - tokens are, or rather their mutual associations in high-dimensional latent space. Language as we recognize it, sequences of characters and words, is just a special case. Multimodal models manage to turn audio, video, and text into tokens in the same space - they do not route through text when consuming or generating images.
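        As a rough illustration (toy numbers, not any real architecture): a text token is looked up in an embedding table, an image patch is linearly projected, and both land as vectors in the same latent space the model attends over:

          import numpy as np

          # Toy sketch: text tokens and image patches both become vectors in
          # one shared d-dimensional latent space, so images never have to be
          # routed through text.
          d_model = 8

          # Text path: a token id indexes an embedding table.
          vocab_size = 100
          text_embeddings = np.random.randn(vocab_size, d_model)
          text_vec = text_embeddings[42]                 # shape (d_model,)

          # Image path: a flattened 16x16 RGB patch is linearly projected.
          patch = np.random.randn(16 * 16 * 3)
          patch_proj = np.random.randn(patch.size, d_model)
          image_vec = patch @ patch_proj                 # shape (d_model,)

          # Both kinds of "token" can now sit in one sequence for one model.
          sequence = np.stack([text_vec, image_vec])     # shape (2, d_model)
          print(sequence.shape)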

      • pegasus 20 hours ago

        > manipulating the tokens of language might be more central to human cognition than we've tended to think

        I'm convinced of this. I think it's because we've always looked at the most advanced forms of human languaging (like philosophy) to understand ourselves. But human language must have evolved from forms of communication found in other species, especially highly intelligent ones. It's to be expected that its building blocks are based on things like imitation, playful variation, and pattern-matching, harnessing capabilities brains had been developing long before language, only now in the emerging world of sounds, calls, and vocalizations.

        Ironically, the other crucial ingredient for AGI, which LLMs don't have but we do, is exactly that animal nature we always try to shove under the rug, over-attributing our success to the stochastic parrot part of us and ignoring the gut instinct, the intuitive, spontaneous insight into things that a lot of the great scientists and artists of the past have talked about.

    • strbean 21 hours ago

      You realize the parent said "This would be an interesting way to test proposition X" and you responded with "X is false because I say so", right?

      • viccis 19 hours ago

        Yes. That is correct. If I told you I planned on going outside this evening to test whether the sun sets in the east, the best response would be to let me know ahead of time that my hypothesis is wrong.

      • anonymous908213 20 hours ago

        "Proposition X" does not need testing. We already know X is categorically false because we know how LLMs are programmed, and not a single line of that programming pertains to thinking (thinking in the human sense, not "thinking" in the LLM sense which merely uses an anthromorphized analogy to describe a script that feeds back multiple prompts before getting the final prompt output to present to the user). In the same way that we can reason about the correctness of an IsEven program without writing a unit test that inputs every possible int32 to "prove" it, we can reason about the fundamental principles of an LLM's programming without coming up with ridiculous tests. In fact the proposed test itself is less eminently verifiable than reasoning about correctness; it could be easily corrupted by, for instance, incorrectly labelled data in the training dataset, which could only be determined by meticulously reviewing the entirety of the dataset.

        The only people who are serious about suggesting that LLMs could possibly 'think' are the people committing fraud on the scale of hundreds of billions of dollars (good for them on finding the all-time grift!) and the people who don't understand how they're programmed, and thus are the target of the grift. Granted, given that the vast majority of humanity are not programmers, and even fewer are programmers educated in the intricacies of ML, the grift's target pool numbers in the billions.