Comment by abraxas 4 days ago

I think, LLM or no LLM, the emergence of intelligence appears to be closely tied to the number of synapses in a network, whether biological or digital. If my hypothesis is roughly right, we are several orders of magnitude away from AGI, at least the kind of AGI that can be embodied in a fully functional robot with sensory apparatus that rivals the human body. Building circuits of that density is likely to take decades, and most probably a transistor-based, silicon substrate can't be pushed that far.

joshjob42 4 days ago

I think the general expectation is that there are around 100T synapses in the brain. It's probably not a 1:1 correspondence with neural network parameters, but it doesn't seem infeasible to me that a dense-equivalent 100T-parameter model, trained properly, could rival the best humans.

If it's basically a transformer, that means inference needs ~200T FLOPs per token (roughly 2 FLOPs per parameter). The paper assumes humans "think" at ~15 tokens/second, which is about 10 words/second, similar to the reading speed of a college graduate. That works out to ~3 petaFLOP/s of compute.

Assuming that's fp8, an H100 can do ~4 petaFLOP/s, and the authors of AI 2027 guesstimate that purpose-built wafer-scale inference chips circa late 2027 should be able to do ~400 petaFLOP/s, roughly 100 H100s' worth, for ~$600k each for fabrication and installation in a datacenter.

Rounding, that means ~$6k buys you the compute to "think" at 10 words/second. Generally speaking, that'd probably work out to ~$3k/yr after depreciation and electricity costs, or ~30-50¢ per hour of "human thought equivalent" at 10 words/second. Running an AI at 50x human speed 24/7 would cost ~$23k/yr, so one OpenBrain researcher's salary could fund a team of ~10-20 such AIs running flat out all the time. Even if you think the AI would need an "extra" 10x or even 100x in tokens/second to match humans, that still puts genius-level AIs, runnable at human speed, at 0.1 to 1x the median US income.

There's an open question whether training such a model is feasible in a few years, but the raw compute capability at the chip level to run a model that large at enormous speed and low cost already exists (at the street price of B200s it'd cost ~$2-4 per human-equivalent-hour).
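The arithmetic in the comment above can be checked with a few lines of Python. All inputs are the thread's own guesses (100T parameters, 15 tokens/s, the hypothetical $600k wafer-scale chip), not measured values; the thread rounds the final figure up to ~$6k per human-equivalent.

```python
# Back-of-the-napkin check of the inference-cost estimate.
params = 100e12               # ~100T dense-equivalent parameters (assumed)
flops_per_token = 2 * params  # ~2 FLOPs per parameter per token for a transformer
tokens_per_sec = 15           # assumed "human thinking speed" (~10 words/s)

required_flops = flops_per_token * tokens_per_sec  # FLOP/s for 1x human speed
print(f"{required_flops / 1e15:.0f} PFLOP/s")      # -> 3 PFLOP/s

chip_flops = 400e15  # hypothetical late-2027 wafer-scale inference chip (AI 2027 guess)
chip_cost = 600_000  # fabrication + installation, per the same guesstimate

humans_per_chip = chip_flops / required_flops
print(f"{humans_per_chip:.0f} human-equivalents per chip")          # -> 133
print(f"${chip_cost / humans_per_chip:,.0f} per human-equivalent")  # -> $4,500
```

The raw division gives ~$4.5k per human-equivalent; the ~$6k in the comment is that figure with some rounding headroom.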

  • brookst 3 days ago

Excellent back-of-the-napkin math, and it feels intuitively right.

And I think training is similar: training is capital-intensive and therefore centralized, but if 100M people are paying $6k for their inference hardware, add $100/year as a training tax (er, subscription) and you've got $10B/year for training operations.

ivraatiems 4 days ago

I think there is a good chance you are roughly right. I also think the "secret sauce" of sapience is probably not something that can be replicated easily with the technology we have now, like LLMs: they're missing the contextual awareness and processing that real reasoning absolutely requires.

But even so, solving that problem feels much more attainable than it used to be.

  • throwup238 4 days ago

I think the missing secret sauce is an equivalent to neuroplasticity. Human brains are constantly being rewired and optimized at every level: synapses and their channels undergo long-term potentiation and depression, new connections are formed and useless ones pruned, and the whole system can sometimes remap functions to a different part of the brain when another suffers catastrophic damage. I don't know enough about the matrix multiplication operations that power LLMs, but it's hard to imagine how that kind of organic reorganization would be possible with GPU matmuls. It'd require some sort of advanced "self-aware" profile-guided optimization, not just trial-and-error noodling with Torch ops or CUDA kernels.

I assume that, thanks to the universal approximation theorem, it's theoretically possible to emulate the physical mechanism, but at what hardware and training cost? I've done back-of-the-napkin math on this before [1], and the number of "parameters" in the brain is at least 2-4 orders of magnitude more than state-of-the-art models. But that's just the current weights; what about the history that actually enables the plasticity? Channel threshold potentials are also continuous rather than discrete, and emulating them might require full fp64, so I'm not sure how we're even going to get to the memory requirements in the next decade, let alone whether any architecture on the horizon can emulate neuroplasticity.

    Then there’s the whole problem of a true physical feedback loop with which the AI can run experiments to learn against external reward functions and the core survival reward function at the core of evolution might itself be critical but that’s getting deep into the research and philosophy on the nature of intelligence.

    [1] https://news.ycombinator.com/item?id=40313672
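The memory implication of the parent's "2-4 orders of magnitude" claim can be sketched quickly. The frontier-model scale (~2T parameters) and fp64 storage are illustrative assumptions taken from the comment's own argument, not measured figures:

```python
# Rough memory estimate for the parameter-count comparison above.
sota_params = 2e12        # assumed scale of frontier models (~trillions of params)
brain_factors = (10**2, 10**4)  # "2-4 orders of magnitude" more
bytes_per_param = 8       # fp64, per the channel-potential argument

for factor in brain_factors:
    params = sota_params * factor
    terabytes = params * bytes_per_param / 1e12
    print(f"{params:.0e} params -> {terabytes:,.0f} TB just for weights")
```

Even at the low end that's ~1,600 TB of weights, and the high end reaches 160 PB, before counting any of the history or state that plasticity would require.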

    • lblume 3 days ago

Transformers are already very flexible. We know that we can basically strip blocks at will, reorder modules, transform their inputs in predictable ways, or obstruct some features, and after a very short period of re-training they get back to basically the same capabilities they had before. Fascinating stuff.

  • narenm16 4 days ago

    i agree. it feels like scaling up these large models is such an inefficient route that seems to be warranting new ideas (test-time compute, etc).

we'll likely reach a point where it's infeasible for deep learning to completely encompass human-level reasoning, and we'll need neuroscience discoveries to continue progress. altman seems to be hyping up "bigger is better," not just for model parameters but for openai's valuation.

baq 4 days ago

Exponential growth means the first order of magnitude comes slowly and the last one runs past you unexpectedly.

  • Palmik 4 days ago

    Exponential growth generally means that the time between each order of magnitude is roughly the same.

    • brookst 3 days ago

      At the risk of pedantry, is that true? Something that doubles annually sure seems like exponential growth to me, but the orders of magnitude are not at all the same rate. Orders of magnitude are a base-10 construct but IMO exponents don’t have to be 10.

      EDIT: holy crap I just discovered a commonly known thing about exponents and log. Leaving comment here but it is wrong, or at least naive.
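The point Palmik makes, and that brookst's edit concedes, is easy to verify numerically: under any fixed growth rate, each successive 10x takes the same amount of time. A quick sketch, assuming 2x annual growth:

```python
import math

# Doubling annually: the time per order of magnitude (10x) is constant.
years_per_decade = math.log(10, 2)  # ~3.32 years per 10x at 2x/year
print(f"{years_per_decade:.2f} years per order of magnitude")

def years_to_reach(multiple, growth_per_year=2.0):
    """Years until cumulative growth reaches the given multiple."""
    return math.log(multiple, growth_per_year)

# Gap between reaching 10x, 100x, 1000x, ... is the same every time.
gaps = [years_to_reach(10**(k + 1)) - years_to_reach(10**k) for k in range(4)]
print(gaps)  # each ~3.32
```

The base of the exponent only sets *how long* an order of magnitude takes, not whether the spacing is constant, which is why doubling annually is still exponential growth in base-10 terms.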

UltraSane 4 days ago

Why can't the compute be remote from the robot? That is a major advantage of human technology over biology.

  • abraxas 4 days ago

Mostly latency. But even if a single robot could be driven by a data centre, consider the energy and hardware investment required to make such a creature practical.

    • Jensson 4 days ago

1ms of latency is more than fast enough; you probably have more latency than that between the CPU and the GPU.

      • Symmetry 3 days ago

        We've got 10ms of latency between our brains and our hands along our nerve fibers and we function all right.

    • UltraSane 3 days ago

The Figure robots use a two-level control scheme: a fast LLM at 200Hz directly controlling the robot and a slow planning LLM running at 7Hz. The planning LLM could be very far away indeed and still fit within its ~143ms cycle budget (one 7Hz cycle is 1/7 s ≈ 142.8ms).

    • UltraSane 4 days ago

Latency would be kept low by keeping the compute nearby. One 1U or 2U server per robot would be reasonable.