Comment by abraxas 4 days ago

I think, LLM or no LLM, the emergence of intelligence appears to be closely tied to the number of synapses in a network, whether biological or digital. If my hypothesis is roughly right, we are several orders of magnitude away from AGI, at least the kind of AGI that can be embodied in a fully functional robot with sensory apparatus that rivals the human body. Building circuits of that density is likely to take decades, and most probably a transistor-based, silicon substrate can't be pushed that far.

joshjob42 4 days ago

I think the general expectation is that there are around 100T synapses in the brain. It's probably not a 1:1 correspondence with neural network parameters, but it doesn't seem infeasible to me that a dense-equivalent 100T-parameter model, trained properly, could rival the best humans.

If it's basically a transformer, that means inference needs ~200T FLOPs per token (roughly 2 FLOPs per parameter). The paper assumes humans "think" at ~15 tokens/second, which is about 10 words/second, similar to the reading speed of a college graduate. That works out to ~3 petaFLOP/s of compute.

Assuming that's fp8, an H100 can do ~4 petaFLOP/s, and the authors of AI 2027 guesstimate that purpose-built wafer-scale inference chips circa late 2027 should be able to do ~400 petaFLOP/s, roughly 100 H100s' worth, for ~$600k each for fabrication and installation in a datacenter.

Rounding, that means ~$6k buys you the compute to "think" at 10 words/second. Generally speaking, that'd probably work out to ~$3k/yr after depreciation and electricity costs, or ~30-50¢ per hour of "human thought equivalent" at 10 words/second. Running an AI at 50x human speed 24/7 would cost ~$23k/yr, so one OpenBrain researcher's salary could fund a team of ~10-20 such AIs running flat out all the time. Even if you think the AI would need an "extra" 10x or even 100x in tokens/second to match humans, that still puts genius-level AIs, runnable at human speed, at 0.1 to 1x the median US income.

There's an open question whether training such a model is feasible in a few years, but the raw compute capability at the chip level to run a model that large at enormous speed and low cost already exists (at the street price of B200s it'd cost ~$2-4 per human-equivalent-hour).
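The arithmetic in the comment above can be checked with a few lines of Python. All inputs are the thread's own guesses (100T parameters, 15 tokens/s, the hypothetical $600k wafer-scale chip), not measured values; the thread rounds the final figure up to ~$6k per human-equivalent.

```python
# Back-of-the-napkin check of the inference-cost estimate.
params = 100e12               # ~100T dense-equivalent parameters (assumed)
flops_per_token = 2 * params  # ~2 FLOPs per parameter per token for a transformer
tokens_per_sec = 15           # assumed "human thinking speed" (~10 words/s)

required_flops = flops_per_token * tokens_per_sec  # FLOP/s for 1x human speed
print(f"{required_flops / 1e15:.0f} PFLOP/s")      # -> 3 PFLOP/s

chip_flops = 400e15  # hypothetical late-2027 wafer-scale inference chip (AI 2027 guess)
chip_cost = 600_000  # fabrication + installation, per the same guesstimate

humans_per_chip = chip_flops / required_flops
print(f"{humans_per_chip:.0f} human-equivalents per chip")          # -> 133
print(f"${chip_cost / humans_per_chip:,.0f} per human-equivalent")  # -> $4,500
```

The raw division gives ~$4.5k per human-equivalent; the ~$6k in the comment is that figure with some rounding headroom.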

  • brookst 3 days ago

Excellent back-of-the-napkin math, and it feels intuitively right.

And I think training is similar: training is capital-intensive and therefore centralized, but if 100M people are paying $6k for their inference hardware, add $100/year as a training tax (er, subscription) and you've got $10B/year for training operations.

ivraatiems 4 days ago

I think there is a good chance you are roughly right. I also think the "secret sauce" of sapience is probably not something that can be replicated easily with the technology we have now, like LLMs: they're missing the contextual awareness and processing that real reasoning absolutely requires.

But even so, solving that problem feels much more attainable than it used to be.

  • throwup238 4 days ago

I think the missing secret sauce is an equivalent to neuroplasticity. Human brains are constantly being rewired and optimized at every level: synapses and their channels undergo long-term potentiation and depression, new connections are formed and useless ones pruned, and the whole system can sometimes remap functions to a different part of the brain when another suffers catastrophic damage. I don't know enough about the matrix multiplication operations that power LLMs, but it's hard to imagine how that kind of organic reorganization would be possible with GPU matmuls. It'd require some sort of advanced "self-aware" profile-guided optimization, not just trial-and-error noodling with Torch ops or CUDA kernels.

I assume that, thanks to the universal approximation theorem, it's theoretically possible to emulate the physical mechanism, but at what hardware and training cost? I've done back-of-the-napkin math on this before [1], and the number of "parameters" in the brain is at least 2-4 orders of magnitude more than state-of-the-art models. But that's just the current weights; what about the history that actually enables the plasticity? Channel threshold potentials are also continuous rather than discrete, and emulating them might require full fp64, so I'm not sure how we're even going to get to the memory requirements in the next decade, let alone whether any architecture on the horizon can emulate neuroplasticity.

    Then there’s the whole problem of a true physical feedback loop with which the AI can run experiments to learn against external reward functions and the core survival reward function at the core of evolution might itself be critical but that’s getting deep into the research and philosophy on the nature of intelligence.

    [1] https://news.ycombinator.com/item?id=40313672
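The memory implication of the parent's "2-4 orders of magnitude" claim can be sketched quickly. The frontier-model scale (~2T parameters) and fp64 storage are illustrative assumptions taken from the comment's own argument, not measured figures:

```python
# Rough memory estimate for the parameter-count comparison above.
sota_params = 2e12        # assumed scale of frontier models (~trillions of params)
brain_factors = (10**2, 10**4)  # "2-4 orders of magnitude" more
bytes_per_param = 8       # fp64, per the channel-potential argument

for factor in brain_factors:
    params = sota_params * factor
    terabytes = params * bytes_per_param / 1e12
    print(f"{params:.0e} params -> {terabytes:,.0f} TB just for weights")
```

Even at the low end that's ~1,600 TB of weights, and the high end reaches 160 PB, before counting any of the history or state that plasticity would require.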

    • lblume 3 days ago

Transformers are already very flexible. We know that we can basically strip blocks at will, reorder modules, transform their inputs in predictable ways, or obstruct some features, and after a very short period of re-training they get back to basically the same capabilities they had before. Fascinating stuff.

  • narenm16 4 days ago

    i agree. it feels like scaling up these large models is such an inefficient route that seems to be warranting new ideas (test-time compute, etc).

we'll likely reach a point where it's infeasible for deep learning to completely encompass human-level reasoning, and we'll need neuroscience discoveries to continue progress. altman seems to be hyping up "bigger is better," not just for model parameters but for openai's valuation.

baq 4 days ago

Exponential growth means the first order of magnitude comes slowly and the last one runs past you unexpectedly.

  • Palmik 4 days ago

    Exponential growth generally means that the time between each order of magnitude is roughly the same.

    • brookst 3 days ago

      At the risk of pedantry, is that true? Something that doubles annually sure seems like exponential growth to me, but the orders of magnitude are not at all the same rate. Orders of magnitude are a base-10 construct but IMO exponents don’t have to be 10.

      EDIT: holy crap I just discovered a commonly known thing about exponents and log. Leaving comment here but it is wrong, or at least naive.
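The point Palmik makes, and that brookst's edit concedes, is easy to verify numerically: under any fixed growth rate, each successive 10x takes the same amount of time. A quick sketch, assuming 2x annual growth:

```python
import math

# Doubling annually: the time per order of magnitude (10x) is constant.
years_per_decade = math.log(10, 2)  # ~3.32 years per 10x at 2x/year
print(f"{years_per_decade:.2f} years per order of magnitude")

def years_to_reach(multiple, growth_per_year=2.0):
    """Years until cumulative growth reaches the given multiple."""
    return math.log(multiple, growth_per_year)

# Gap between reaching 10x, 100x, 1000x, ... is the same every time.
gaps = [years_to_reach(10**(k + 1)) - years_to_reach(10**k) for k in range(4)]
print(gaps)  # each ~3.32
```

The base of the exponent only sets *how long* an order of magnitude takes, not whether the spacing is constant, which is why doubling annually is still exponential growth in base-10 terms.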

UltraSane 4 days ago

Why can't the compute be remote from the robot? That is a major advantage of human technology over biology.

  • abraxas 4 days ago

Mostly latency. But even if a single robot could be driven by a data centre, consider the energy and hardware investment required to make such a creature practical.

    • Jensson 4 days ago

1ms of latency is more than fast enough; you probably have more latency than that between the CPU and the GPU.

      • Symmetry 3 days ago

        We've got 10ms of latency between our brains and our hands along our nerve fibers and we function all right.

    • UltraSane 3 days ago

The Figure robots use a two-level control scheme: a fast LLM at 200Hz directly controlling the robot and a slow planning LLM running at 7Hz. The planning LLM could be very far away indeed and still fit within its ~143ms cycle budget (one 7Hz cycle is 1/7 s ≈ 142.8ms).

    • UltraSane 4 days ago

Latency would be kept low by keeping the compute nearby. One 1U or 2U server per robot would be reasonable.