Comment by sigmoid10 2 days ago

>It's extremely difficult to increase compute by 100 times, but with sufficient investment in talent, achieving a 10x increase in compute is more feasible.

The article explains how in reality the opposite is true, especially when you look at it long term: compute power grows exponentially, humans do not.

llm_trw 2 days ago

If the bitter lesson were true, we'd be getting SOTA results out of two-layer neural networks using tanh as the activation function.

It's a lazy blog post that should be thrown out after a minute of thought by anyone in the field.
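For reference, the kind of model being dismissed here is trivial to write down. Below is a minimal two-layer tanh network in NumPy, trained on XOR with plain gradient descent; it is an illustrative sketch (hyperparameters and the toy task are assumptions, not from the thread), and it fits toy problems while being nowhere near SOTA on real tasks:

```python
import numpy as np

# Minimal two-layer tanh network (the architecture the parent alludes to),
# trained on the XOR toy problem with plain gradient descent.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)   # hidden layer, 8 units
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)   # output layer
lr = 0.05

def loss():
    h = np.tanh(X @ W1 + b1)
    out = np.tanh(h @ W2 + b2)
    return float(np.mean((out - y) ** 2))

loss0 = loss()
for _ in range(5000):
    h = np.tanh(X @ W1 + b1)       # forward pass, tanh activations
    out = np.tanh(h @ W2 + b2)
    err = out - y
    # Backprop: tanh'(z) = 1 - tanh(z)^2
    d_out = err * (1 - out**2)
    d_h = (d_out @ W2.T) * (1 - h**2)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(0)

print(f"loss before: {loss0:.3f}, after: {loss():.3f}")
```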

  • sigmoid10 a day ago

    That's not how the economics work. There has been a lot of research showing that deeper nets are more efficient. So if you spend a ton of compute money on a model, you'll want the best output - even though you could just as well build something shallow that may well be state of the art for its depth, but can't keep up with the competition on real tasks.

    • llm_trw a day ago

      Which is my point.

      You need a ton of specialized knowledge to use compute effectively.

      If we had infinite memory and infinite compute we'd just throw every problem of length n to a tensor of size R^(n^n).

      The issue is that we don't have enough memory in the world to store that tensor for something as trivial as MNIST (and won't until the 2100s). And as you can imagine, the exponentiated exponential grows a bit faster than the exponential, so we never will.
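A back-of-envelope check of the blowup described above, taking n = 784 for MNIST's 28x28 input pixels (my assumed reading of "problem of length n"); working in log10 avoids materializing the astronomically large number:

```python
import math

# Sketch of the memory cost of a dense tensor with n^n entries,
# as the parent describes, for an MNIST-sized input.
n = 784                                       # MNIST: 28x28 = 784 pixels (assumption)
log10_entries = n * math.log10(n)             # log10(n^n) = n * log10(n)
log10_bytes = log10_entries + math.log10(4)   # 4 bytes per float32 entry

print(f"n^n has ~10^{log10_entries:.0f} entries")
print(f"storing it in float32 needs ~10^{log10_bytes:.0f} bytes")
# For scale: total data storage on Earth today is on the order of 10^23 bytes.
```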

      • sigmoid10 a day ago

        Then how does this invalidate the bitter lesson? It's like you're saying if aerodynamics were true, we'd have planes flying like insects by now. But that's simply not how it works at large scales - in particular if you want to build something economical.

OtherShrezzing 2 days ago

Humans don't grow exponentially indefinitely. But there are only something on the order of 100k AI researchers employed in the big labs right now. Meanwhile, there are around 20mn software engineers globally, and around 200k math graduates per year.

The number of humans who could feasibly work on this problem is pretty high, and the labs could grow an order of magnitude, and still only be tapping into the top 1-2% of engineers & mathematicians. They could grow two orders of magnitude before they've absorbed all of the above-average engineers & mathematicians in the world.

  • sigmoid10 2 days ago

    I'd actually say the market is stretched pretty thin by now. I've been an AI researcher for a decade, and what passes for an AI researcher or engineer these days is borderline worthless. You can get a lot of people who can use scripts and middleware like frontend lego sets to build things, but I'd say there are fewer than 1k people in the world right now who can actually meaningfully improve algorithmic design. There are a lot more people out there who do systems design and cloud ops, so only if you go the scaling route will you find plentiful human brainpower.

    • llm_trw 2 days ago

      Do you know where people who are interested in research congregate? Every forum, meetup, or journal gets overwhelmed by bullshit within a year of being good.

      • sigmoid10 a day ago

        Universities (at least certain ones) and startups (more in absolute terms than universities, but there's also a much bigger fraction of swindlers). Most blogs and forums are garbage. If you're not inside these ecosystems, try to find out who the smart/talented people are by reading influential papers. Then you can start following them on X, LinkedIn, etc. and often you'll see what they're up to next. For example, there's a pretty clear research paper and hiring trail of certain people that eventually led to GPT-4, even though OpenAI never published anything on the architecture.

smy20011 2 days ago

Humans do write code that scales with compute.

Actual performance is always raw performance * software efficiency. You can use shitty software and waste all those FLOPs.
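That product can be made concrete with a toy calculation; the hardware and efficiency numbers below are purely illustrative assumptions, not measurements:

```python
# Effective throughput = raw hardware FLOP/s * software efficiency (utilization).
raw_flops = 1e15         # hypothetical 1 PFLOP/s accelerator (assumption)
efficiency_tuned = 0.50  # well-optimized kernels (illustrative number)
efficiency_naive = 0.05  # sloppy implementation (illustrative number)

print(f"tuned: {raw_flops * efficiency_tuned:.2e} effective FLOP/s")
print(f"naive: {raw_flops * efficiency_naive:.2e} effective FLOP/s")
# A 10x gap in software efficiency erases years of hardware gains.
```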

alecco 2 days ago

Algorithmic improvements in new fields are often bigger than hardware improvements.