Comment by dogma1138 a day ago

Would be interesting to train a cutting-edge model with a cut-off date of, say, 1900 and then prompt it about QM and relativity with some added context.

If the model comes up with anything even remotely correct, it would be quite strong evidence that LLMs are a path to something bigger; if not, then I think it's time to go back to the drawing board.

bazzargh a day ago

You would find things in there that were already close to QM and relativity. The Michelson-Morley experiment was 1887 and the Lorentz-FitzGerald contraction came along in 1889. The photoelectric effect (which Einstein explained in terms of photons in 1905) was also discovered in 1887. William Clifford (who _died_ in 1879) had notions that foreshadowed general relativity: "Riemann, and more specifically Clifford, conjectured that forces and matter might be local irregularities in the curvature of space, and in this they were strikingly prophetic, though for their pains they were dismissed at the time as visionaries." - Banesh Hoffmann (1973)

Things don't happen all of a sudden, and with the ability to see all the scientific papers of the era, it's possible those ideas could have fallen out of the synthesis.

  • matthewh806 a day ago

    I presume that's what the parent post is trying to get at? Seeing if, given the cutting-edge scientific knowledge of the day, the LLM is able to synthesize it all into a workable theory of QM by making the necessary connections and (quantum...) leaps.

    Standing on the shoulders of giants, as it were

    • palmotea 21 hours ago

      But that's not the OP's challenge; he said "if the model comes up with anything even remotely correct." The point is there were things already "remotely correct" out there in 1900. If the LLM finds them, it wouldn't "be quite strong evidence that LLMs are a path to something bigger."

      • pegasus 20 hours ago

        It's not the comment which is illogical, it's your (mis)interpretation of it. What I (and seemingly others) took it to mean is basically could an LLM do Einstein's job? Could it weave together all those loose threads into a coherent new way of understanding the physical world? If so, AGI can't be far behind.

    • golem14 10 hours ago

      I think it's not productive to just have the LLM sit like Mycroft in his armchair and, from there, return you an excellent expert opinion.

      That's not how science works.

      The LLM would have to propose experiments (which would have to be simulated), and then develop its theories from that.

      Maybe there were enough facts around to suggest a number of hypotheses, but the LLM in its current form won't be able to confirm them.

    • actionfromafar a day ago

      Yeah but... we still might not know whether it could do that because we were already really close by 1900, or because the LLM is very smart.

      • scottlamb 21 hours ago

        What's the bar here? Does anyone say "we don't know if Einstein could do this because we were really close or because he was really smart?"

        I by no means believe LLMs are general intelligence, and I've seen them produce a lot of garbage, but if they could produce these revolutionary theories from only <= year 1900 information and a prompt that is not ridiculously leading, that would be a really compelling demonstration of their power.

      • sleet_spotter 21 hours ago

        Well, if one had enough time and resources, this would make for an interesting metric. Could it figure it out with cut-off of 1900? If so, what about 1899? 1898? What context from the marginal year was key to the change in outcome?
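
        As a rough sketch of that sweep (purely illustrative; load_corpus and train_and_quiz are hypothetical stand-ins for "gather year-tagged documents" and "train a model, then probe it about SR/QM precursors"):

            # Ablate the cut-off year and look for where the outcome changes.
            def load_corpus() -> list:          # hypothetical: returns [(year, text), ...]
                return []

            def train_and_quiz(docs) -> float:  # hypothetical: train, probe, score
                return 0.0

            corpus = load_corpus()
            for cutoff in (1900, 1899, 1898, 1897):
                subset = [text for year, text in corpus if year <= cutoff]
                print(cutoff, train_and_quiz(subset))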

  • somenameforme 11 hours ago

    It's only easy to see precursors in hindsight. The Michelson-Morley tale is a great example of this. In hindsight, their experiment was screaming relativity, because it demonstrated that the speed of light was identical from two perspectives, something that's very difficult to explain without relativity. Lorentz contraction was just a completely ad-hoc proposal to maintain the assumptions of the time (the luminiferous aether in particular) while also explaining the result. But at the time it was not seen as that big of a deal.

    There's a very similar parallel with dark matter in modern times. We certainly have endless hints to the truth that will be evident in hindsight, but for now? We are mostly convinced that we know the truth, perform experiments to prove that, find nothing, shrug, adjust the model to be even more esoteric, and repeat onto the next one. And maybe one will eventually show something, or maybe we're on the wrong path altogether. This quote, from Michelson in 1894 (more than a decade before Einstein would come along), is extremely telling of the opinion at the time:

    "While it is never safe to affirm that the future of Physical Science has no marvels in store even more astonishing than those of the past, it seems probable that most of the grand underlying principles have been firmly established and that further advances are to be sought chiefly in the rigorous application of these principles to all the phenomena which come under our notice. It is here that the science of measurement shows its importance — where quantitative work is more to be desired than qualitative work. An eminent physicist remarked that the future truths of physical science are to be looked for in the sixth place of decimals." - Michelson 1894

    • vasco 8 hours ago

      With the passage of time, more and more things have been discovered through precision: by identifying small errors in some measurement and pursuing them to find the cause.

      • somenameforme 8 hours ago

        It's not precision that's the problem, but understanding when something has been falsified. For instance the Lorentz transformations work as a perfectly fine ad-hoc solution to Michelson's discovery. All it did was make the aether a bit more esoteric in nature. Why do you then not simply shrug, accept it, and move on? Perhaps even toss some accolades towards Lorentz for 'solving' the puzzle? Michelson himself certainly felt there was no particularly relevant mystery outstanding.

        For another parallel our understanding of the big bang was, and probably is, wrong. There are a lot of problems with the traditional view of the big bang with the horizon problem [1] being just one among many - areas in space that should not have had time to interact behave like they have. So this was 'solved' by an ad hoc solution - just make the expansion of the universe go into super-light speed for a fraction of a second at a specific moment, slow down, then start speeding up again (cosmic inflation [2]) - and it all works just fine. So you know what we did? Shrugged, accepted it, and even gave Guth et al a bunch of accolades for 'solving' the puzzle.

        This is the problem - arguably the most important principle of science is falsifiability. But when is something falsified? Because in many situations, probably the overwhelming majority, you can instead just use one falsification to create a new hypothesis with that nuance integrated into it. And as science moves beyond singular formulas derived from clear principles or laws and onto broad encompassing models based on correlations from limited observations, this becomes more and more true.

        [1] - https://en.wikipedia.org/wiki/Horizon_problem

        [2] - https://en.wikipedia.org/wiki/Cosmic_inflation

  • bhaak a day ago

    This would still be valuable even if the LLM only finds out about things that are already in the air.

    It’s probably even more of a problem that different areas of scientific development don’t know about each other. An LLM combining results from different fields still wouldn’t be inventing something new.

    But if they could give us a head start of 20 years on certain developments this would be an awesome result.

  • Shorel 19 hours ago

    Then that experiment is even more interesting, and should be done.

    My own prediction is that the LLMs would totally fail at connecting the dots, but a small group of very smart humans can.

    Things don't happen all of a sudden, but they also don't happen everywhere. Most people in most parts of the world would never connect the dots. Scientific curiosity is something valuable and fragile, that we just take for granted.

    • bigfudge 17 hours ago

      One of the reasons they don’t happen everywhere is that there are just a few places at any given point in time with enough well connected and educated individuals who are in a position to even see all the dots, let alone connect them. This doesn’t discount the achievement if an LLM also manages to, but I think it’s important to recognise that having enough giants in sight is an important prerequisite to standing on their shoulders.

  • mannykannot 13 hours ago

    If (as you seem to be suggesting) relativity was effectively lying there on the table waiting for Einstein to just pick it up, how come it blindsided most, if not quite all, of the greatest minds of his generation?

    • TeMPOraL 13 hours ago

      That's the case with all scientific discoveries - pieces of prior work get accumulated, until it eventually becomes obvious[0] how they connect, at which point someone[1] connects the dots, making a discovery... and putting it on the table, for the cycle to repeat anew. This is, in a nutshell, the history of all scientific and technological progress. Accumulation of tiny increments.

      --

      [0] - To people who happen to have the right background and skill set, and are in the right place.

      [1] - Almost always multiple someones, independently, within short time of each other. People usually remember only one or two because, for better or worse, history is much like patent law: first to file wins.

  • djwide 14 hours ago

    With LLMs the synthesis cycles could happen at a much higher frequency. Decades condensed to weeks or days?

    I imagine possible buffers on that conjecture synthesis being experimentation and acceptance by the scientific community. AIs can come up with new ideas every day, but Nature won't publish those ideas for years.

  • dogma1138 6 hours ago

    That is the point.

    New discoveries don’t happen in a vacuum.

    • eru 5 hours ago

      You can get pretty far by modeling only frictionless, spherical discoveries in a vacuum.

  • gus_massa 20 hours ago

    I agree, but it's important to note that QM had no clear formulation until 1925/6; it's like 20 years more of work than SR.

  • jojobas 8 hours ago

    They were close, but it required the best people bashing their heads against each other for years until they got it.

wongarsu 20 hours ago

I'm trying to work towards that goal by training a model on mostly German science texts up to 1904 (before the world wars German was the lingua franca of most sciences).

Training data for a base model isn't that hard to come by, even though you have to OCR most of it yourself because the publicly available OCRed versions are commonly unusably bad. But training a model large enough to be useful is a major issue. Training a 700M parameter model at home is very doable (and is what this TimeCapsuleLLM is), but to get that kind of reasoning you need something closer to a 70B model. Also a lot of the "smarts" of a model gets injected in fine tuning and RL, but any of the available fine tuning datasets would obviously contaminate the model with 2026 knowledge.
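
For a rough idea of what the from-scratch route looks like, here is a minimal sketch (HuggingFace tokenizers/datasets/transformers; the folder name, vocabulary size and model size are illustrative assumptions, not the setup described above):

    # Train a byte-level BPE tokenizer and a small GPT-2-style model entirely from
    # scratch, so nothing pretrained (and hence nothing post-cutoff) leaks into it.
    from pathlib import Path
    from datasets import Dataset
    from tokenizers import ByteLevelBPETokenizer
    from transformers import (DataCollatorForLanguageModeling, GPT2Config,
                              GPT2LMHeadModel, GPT2TokenizerFast, Trainer,
                              TrainingArguments)

    files = [str(p) for p in Path("corpus_pre1904").glob("*.txt")]  # OCR'd texts

    # The tokenizer is trained only on the historical corpus; a modern pretrained
    # tokenizer would already carry post-cutoff vocabulary.
    bpe = ByteLevelBPETokenizer()
    bpe.train(files, vocab_size=32_000, special_tokens=["<|endoftext|>"])
    Path("tok_1904").mkdir(exist_ok=True)
    bpe.save_model("tok_1904")
    tok = GPT2TokenizerFast.from_pretrained("tok_1904")
    tok.pad_token = "<|endoftext|>"

    # A real run would chunk whole documents rather than truncate each one.
    ds = Dataset.from_dict({"text": [Path(f).read_text(errors="ignore") for f in files]})
    ds = ds.map(lambda batch: tok(batch["text"], truncation=True, max_length=512),
                batched=True, remove_columns=["text"])

    cfg = GPT2Config(vocab_size=tok.vocab_size, n_positions=512,
                     n_embd=768, n_layer=12, n_head=12)  # roughly GPT-2-small sized
    model = GPT2LMHeadModel(cfg)

    trainer = Trainer(model=model,
                      args=TrainingArguments(output_dir="out_1904",
                                             per_device_train_batch_size=8,
                                             num_train_epochs=1),
                      train_dataset=ds,
                      data_collator=DataCollatorForLanguageModeling(tok, mlm=False))
    trainer.train()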

  • benbreen 18 hours ago

    I am a historian and am putting together a grant application for a somewhat similar project (different era and language though). Would you be open to discussing a collaboration? My email is bebreen [at] ucsc [dot] edu.

  • theallan 19 hours ago

    Can we follow along with your work / results somewhere?

  • [removed] 20 hours ago
    [deleted]
swalsh an hour ago

Could be an interesting experiment, but it's not conclusive proof one way or another. So much of what makes LLMs so great today (vs GPT-3.5) would not be in that dataset. The training that turns these models into coding savants has generalized to other areas, just as one example.

redman25 an hour ago

It's a base model. It hasn't been instruction tuned to "solve problems" necessarily. All it can do is attempt to complete text given some starting text.

catlifeonmars 8 hours ago

That’s how p-hacking works (or doesn’t work). This is analogous to shooting an arrow and then drawing a target around where it lands.

  • cornholio an hour ago

    Yes, I don't understand how such an experiment could work. You either:

    A). contaminate the model with your own knowledge of relativity, leading it on to "discover" what you know, or

    B). you will try to simulate a blind operation but without the "competent human physicist knowledgeable up to the 1900 scientific frontier" component prompting the LLM, because no such person is alive today, nor can you simulate them (if you could, then by definition you could use that simulated Einstein to discover relativity, so the problem is moot).

    So in both cases you would prove nothing about what a smart and knowledgeable scientist can achieve today from a frontier LLM.

  • alkindiffie 7 hours ago

    I like that analogy. It reminds me of "Pointing to the moon and looking at my finger".

DevX101 a day ago

Chemistry would be a great space to explore. The last quarter of the 19th century had a ton of advancements in chemistry. It'd be interesting to see if an LLM could propose fruitful hypotheses or make predictions about the science of thermodynamics.

kristopolous 18 hours ago

It's going to be divining tea leaves. It will be 99% wrong and then someone will say 'oh but look at this tea leaf over here! It's almost correct.'

  • darkwater 6 hours ago

    Yes but... aren't human researchers doing the same? They are mostly wrong most of the time, and try again, and verify their work again, until they find something that actually works. What I mean is that this "in hindsight" test would be biased by being in hindsight: because we already know the answer, we would discard the LLM's answer as just randomly generated. But "connecting the dots" is basically doing a lot of trial and error in your mind, emitting only the results that make at least some kind of sense to us.

  • bowmessage 17 hours ago

    Look! It made another TODO-list app on the first try!

jaydepun 11 hours ago

We've thought of doing this sort of exercise at work but mostly hit the wall of data becoming a lot more scarce the further back in time we go. Particularly high quality science data - even going pre-1970 (and that's already a stretch) you lose a lot of information. There's a triple whammy of the data still existing, being accessible in any format, and that format being suitable for training an LLM. Then there are the complications of wanting additional model capabilities that won't leak data causally.

  • permo-w 4 hours ago

    I was wondering this. what is the minimum amount of text an LLM needs to be coherent? as fun of an idea as this is, the samples of its responses are basically babbling nonsense. going further, a lot of what makes LLMs so strong isn't their original training data, but the RLHF done afterwards. RLHF would be very difficult in this case.

amypetrik214 12 hours ago

> If the model comes up with anything even remotely correct, it would be quite strong evidence that LLMs are a path to something bigger; if not, then I think it's time to go back to the drawing board.

In principle I see your point; in practice my default assumption, until proven otherwise, is that a little something post-1900 slipped through.

A much easier approach would be to just download some model, whatever model, today. Then, 5 years from now, see whether the model can get to whatever interesting discoveries have been found.

  • dogma1138 6 hours ago

    Not really, QM and Relativity were chosen because they were theories that were created to fit observations and data. Discoveries over the next 5 years will be trivia rather than logical conclusions.

isolli 3 hours ago

You have to make sure that you make it read an article about a painter falling off a roof with his tools.

bravura 21 hours ago

A rigorous approach to predicting the future of text was proposed by Li et al 2024, "Evaluating Large Language Models for Generalization and Robustness via Data Compression" (https://ar5iv.labs.arxiv.org/html//2402.00861) and I think that work should get more recognition.

They measure compression (perplexity) on future Wikipedia, news articles, code, arXiv papers, and multi-modal data. Data compression is intimately connected with robustness and generalization.
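
A minimal sketch of the underlying measurement (not the paper's exact pipeline; the model name, the 512-token window and the file name are assumptions):

    # Score a frozen model on text written after its training cut-off: lower
    # perplexity on genuinely unseen future text means better generalization.
    import math
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    def perplexity(text: str, window: int = 512) -> float:
        ids = tok(text, return_tensors="pt").input_ids
        total_nll, n_tokens = 0.0, 0
        for i in range(0, ids.size(1) - 1, window):
            # Chunks overlap by one token, so every token is predicted exactly once.
            chunk = ids[:, i : i + window + 1]
            with torch.no_grad():
                loss = model(chunk, labels=chunk).loss  # mean NLL over the chunk
            total_nll += loss.item() * (chunk.size(1) - 1)
            n_tokens += chunk.size(1) - 1
        return math.exp(total_nll / n_tokens)

    # e.g. a Wikipedia article created after the model's cut-off (hypothetical file)
    print(perplexity(open("article_written_after_cutoff.txt").read()))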

  • Otterly99 5 hours ago

    Thanks for the paper, I just read it and loved the approach. I hope the concept of using data compression as a benchmark will take off. In a sense it is kind of similar to the maxim "If you cannot explain something in simple terms, you do not understand it fully".

mannykannot 13 hours ago

That is a very interesting idea, though I would not dismiss LLMs as a dead end if they failed.

samuelson 20 hours ago

I think it would be fun to see if an LLM would reframe some scientific terms from the time in a way that would actually fit in our current theories.

I imagine if you explained quantum field theory to a 19th century scientist they might think of it as a more refined understanding of luminiferous aether.

Or if an 18th century scholar learned about positive and negative ions, it could be seen as an expansion/correction of phlogiston theory.

staticman2 11 hours ago

Don't you need to do reinforcement learning from human feedback to get non-gibberish results from the models in general?

1900-era humans are not available to do this, so I'm not sure how this experiment is supposed to work.

tokai a day ago

Looking at the training data I don't think it will know anything.[0] I doubt On the Connexion of the Physical Sciences (1834) is going to have much about QM. While the cut-off is 1900, it seems many of the texts are much closer to 1800 than 1900.

[0] https://github.com/haykgrigo3/TimeCapsuleLLM/blob/main/Copy%...

  • dogma1138 a day ago

    It doesn’t need to know about QM or relativity, just about the building blocks that led to them, which were very much around by the year 1900.

    In fact you don’t want it to know about them explicitly just have enough background knowledge that you can manage the rest via context.

    • tokai a day ago

      I was vague. My point is that I don't think the building blocks are in the data. It's mainly tertiary and popular sources. Maybe if you had the writings of Victorian scientists, both public and private correspondence.

      • pegasus 20 hours ago

        Probably a lot of it exists, but in archives, private collections etc. Would be great if it all ended up digitized as well.

    • viccis 21 hours ago

      LLMs are models that predict tokens. They don't think, they don't build with blocks. They would never be able to synthesize knowledge about QM.

      • PaulDavisThe1st 21 hours ago

        I am a deep LLM skeptic.

        But I think there are also some questions about the role of language in human thought that leave the door just slightly ajar on the issue of whether or not manipulating the tokens of language might be more central to human cognition than we've tended to think.

        If it turned out that this was true, then it is possible that "a model predicting tokens" has more power than that description would suggest.

        I doubt it, and I doubt it quite a lot. But I don't think it is impossible that something at least a little bit along these lines turns out to be true.

      • strbean 21 hours ago

        You realize the parent said "This would be an interesting way to test proposition X" and you responded with "X is false because I say so", right?

alkindiffie 7 hours ago

That would be possible if LLMs could come up with entirely new words and languages, which I doubt.

nickdothutton 20 hours ago

I would love to ask such a model to summarise the handful of theories or theoretical “roads” being eyed at the time and to make a prediction with reasons as to which looks most promising. We might learn something about blind spots in human reasoning, institutions, and organisations that are applicable today in the “future”.

root_axis 19 hours ago

I think it would raise some interesting questions, but if it did yield anything noteworthy, the biggest question would be why that LLM is capable of pioneering scientific advancements and none of the modern ones are.

  • crazylogger 10 hours ago

    Or maybe, LLMs are pioneering scientific advancements - people are using LLMs to read papers, choose what problems to work on, come up with experiments, analyze results, and draft papers, etc., at this very moment. Except they eventually stick their human names on the cover so we almost never know.

imjonse a day ago

I suppose the vast majority of training data used for cutting edge models was created after 1900.

  • dogma1138 a day ago

    Ofc they are because their primary goal is to be useful and to be useful they need to always be relevant.

    But considering that Special Relativity was published in 1905, which means all its building blocks were already floating in the ether by 1900, it would be a very interesting experiment to train something on Claude/Gemini scale and then, say, give it the field equations and ask it to build a theory around them.

    • famouswaffles a day ago

      His point is that we can't train a Gemini 3/Claude 4.5 etc model because we don't have the data to match the training scale of those models. There aren't trillions of tokens of digitized pre-1900s text.

    • p1esk a day ago

      How can you train a Claude/Gemini scale model if you’re limited to <10% of the training data?

  • kopollo a day ago

    I don't know if this is related to the topic, but GPT5 can convert a photograph of an 1880 Ottoman archival document to English without any loss of quality.

    • ddxv 14 hours ago

      My friend works in that period of Ottoman archives. Do you have a source or something I can share?

defgeneric 19 hours ago

The development of QM was so closely connected to experiments that it's highly unlikely an LLM would get there, even though some of the experiments had been performed prior to 1900.

Special relativity however seems possible.

Affric 13 hours ago

Wow, an actual scientific experiment. Does anyone with expertise know if such things have been done?

metalliqaz a day ago

Yann LeCun spoke explicitly on this idea recently and he asserts definitively that the LLM would not be able to add anything useful in that scenario. My understanding is that other AI researchers generally agree with him, and that it's mostly the hype beasts like Altman that think there is some "magic" in the weights that is actually intelligent. Their payday depends on it, so it is understandable. My opinion is that LeCun is probably correct.

  • johnsmith1840 a day ago

    There is some ability for it to make novel connections but it's pretty small. You can see this yourself by having it build novel systems.

    It largely cannot imagine anything beyond the usual, but there is a small part that can. This is similar to in-context learning: it's weak, but it is there.

    It would be incredible if meta learning/continual learning found a way to train exactly for that novel learning path. But that's literally AGI, so maybe 20 years from now? Or never..

    You can see this on CL benchmarks. There is SOME signal but it's crazy low. When I was training CL models I found that signal was in the single % points. Some could easily argue it was zero but I really do believe there is a very small amount in there.

    This is also why any novel work or findings are done via MASSIVE compute budgets. They find RL environments that can extract that small amount out. Is it random chance? Maybe, hard to say.

    • SoftTalker 17 hours ago

      Is this so different from what we see in humans? Most people do not think very creatively. They apply what they know in situations they are familiar with. In unfamiliar situations they don't know what to do and often fail to come up with novel solutions. Or maybe in areas where they are very experienced they will come up with something incrementally better than before. But occasionally a very exceptional person makes a profound connection or leap to a new understanding.

      • johnsmith1840 17 hours ago

        Sure, we make small steps one at a time, but we compound these, unlike AI.

        AI cannot compound its learning for the foreseeable future.

  • matheusd 20 hours ago

    How about this for an evaluation: Have this (trained-on-older-corpus) LLM propose experiments. We "play the role of nature" and inform it of the results of the experiments. It can then try to deduce the natural laws.

    If we did this (to a good enough level of detail), would it be able to derive relativity? How large of an AI model would it have to be to successfully derive relativity (if it only had access to everything published up to 1904)?
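
    A toy skeleton of that loop, just to make the protocol concrete (llm() and simulate() are hypothetical placeholders: the first would query the period-trained model, the second stands in for "nature", i.e. a simulator that knows the real physics):

        def llm(prompt: str) -> str: ...           # hypothetical: ask the 1904-era model
        def simulate(experiment: str) -> str: ...  # hypothetical: what nature would answer

        notebook = []  # growing record of (proposed experiment, observed result)
        for _ in range(100):
            experiment = llm(f"Given these results: {notebook}\nPropose the next experiment.")
            notebook.append((experiment, simulate(experiment)))
        theory = llm(f"Given all results: {notebook}\nState the underlying laws.")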

    • SirHumphrey 5 hours ago

      I don't know if any dataset of pre 1904 writing would be large enough to train a model that would be smart enough. I suspect that current sized SOTA models would at least get to special relativity, but for general relativity and quantum mechanics I am less sure.

  • samuelson 20 hours ago

    Preface: Most of my understanding of how LLMs actually work comes from 3blue1brown's videos, so I could easily be wrong here.

    I mostly agree with you, especially about distrusting the self-interested hype beasts.

    While I don't think the models are actually "intelligent", I also wonder if there are insights to be gained by looking at how concepts get encoded by the models. It's not really that the models will add something "new", but more that there might be connections between things that we haven't noticed, especially because academic disciplines are so insular these days.

  • mlinksva 20 hours ago

    Do you have a pointer to where LeCun spoke about it? I noticed last October that Dwarkesh mentioned the idea off handedly on his podcast (prompting me to write up https://manifold.markets/MikeLinksvayer/llm-trained-on-data-...) but I wonder if this idea has been around for much longer, or is just so obvious that lots of people are independently coming up with it (parent to this comment being yet another)?

  • djwide 14 hours ago

    What do they (or you) have to say about the Lee Sedol AlphaGo move 78? It seems like that was "new knowledge." Are games just iterable and the real-world idea space not? I am playing with these ideas a little.

    • metalliqaz 14 hours ago

      AlphaGo is not an LLM

      • drdeca 13 hours ago

        And? Do the arguments differ for LLM vs the other models?

        I guess the arguments sometimes mention language. But I feel like the core of the arguments is pretty much the same regardless?

        • metalliqaz 2 hours ago

          The discussion is about training an LLM on old text and then asking it about new concepts.

  • catigula a day ago

    This is definitely wrong, most AI researchers DO NOT agree with LeCun.

    Most ML researchers think AGI is imminent.

    • kingstnap 21 hours ago

      Where do you get your majority from?

      I don't think there is any level of broad agreement right now. There are tons of random camps none of which I would consider to be broadly dominating.

    • rafram 20 hours ago

      The ones being paid a million dollars a year by OpenAI to say stuff like that, maybe.

    • johnsmith1840 20 hours ago

      The guy who built chatgpt literally said we're 20 years away?

      Not sure how to interpret that as almost imminent.

      • nottorp 19 hours ago

        > The guy who built chatgpt literally said we're 20 years away?

        20 years away in 2026, still 20 years away in 2027, etc etc.

        Whatever Altman's hyping, that's the translation.

    • goatlover 20 hours ago

      Do you have a poll of ML researchers that shows this?

    • Alex2037 21 hours ago

      their employment and business opportunities depend on the hype, so they will continue to 'think' that (on xitter) despite the current SOTA of transformers-based models being <100% smarter than >3 year old GPT4, and no revolutionary new architecture in sight.

      • catigula 20 hours ago

        You're going to be in for a very rude awakening.

    • paodealho 20 hours ago

      Well, can you point us to their research then? Please.

a-dub a day ago

yeah i was just wondering that. i wonder how much stem material is in the training set...

  • signa11 a day ago

    i will go for ‘aint gonna happen for a 1000 dollars alex’

SecretDreams 14 hours ago

I like this idea. I think I'd like it more if we didn't have to prompt the LLM in the first place. If it just had all of this information and decided to act upon it. That's what the great minds of history (and even average minds like myself) do. Just think about the facts in our point of view and spontaneously reason something greater out of them.

damnitbuilds 18 hours ago

I like this; it would be exciting (and scary) if it deduced QM, and informative if it could not.

But I also think we can do this with normal LLMs trained on up-to-date text, by asking them to come up with any novel theory that fits the facts. It does not have to be a groundbreaking theory like QM, just original and not (yet) proven wrong?

nickpsecurity 20 hours ago

That would be an interesting experiment. It might be more useful to make a model with a cut-off close to when copyrights expire, to be as modern as possible.

Then, we have a model that knows quite a bit in modern English. We also legally have a data set for everything it knows. Then, there's all kinds of experimentation or copyright-safe training strategies we can do.

Project Gutenberg up to the 1920's seems to be the safest bet on that.

pseudohadamard 9 hours ago

It's already been done, without the model being aware of it, see https://arxiv.org/abs/2512.09742. They also made it think it was Hitler (not MechaHitler, the other guy), and other craziness.

It's a relief to think that we're not trusting these things for stuff like financial advice, medical advice, mental health counselling, ...