Comment by refulgentis 3 days ago
Dwarkesh's blogging confuses me, because I am not sure whether he is free-associating or relaying information he has gathered.
ex. how this reads if it is free-associating: "shower thought: RL on LLMs is kinda just 'did it work or not?' and the answer is just 'yes or no', yes or no is a boolean, a boolean is 1 bit, then bring in the information-theory interpretation of that, therefore RL doesn't give nearly as much info as, like, a bunch of words in pretraining"
or
ex. how this reads if it is relaying information gathered: "A common problem across people at companies who speak honestly with me about the engineering side off the air is figuring out how to get more out of RL. The biggest wall currently is the cross product of RL training being slowww and a lack of GPUs. More than one of them has shared with me that if you can crack the part where the model gets very little info out of one run, then the GPU problem goes away. You can't GPU your way out of how little info they get"
I am continuing to assume it is much more the former than the latter, given your thorough-sounding explanation and my prior that he's not shooting the shit about specific technical problems off-air with multiple grunts.
He is essentially expanding on an idea Andrej Karpathy raised on his podcast about a month prior.
Karpathy says that basically "RL sucks" and that it's like "sucking bits of supervision through a straw".
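The "straw" intuition can be made concrete with a back-of-envelope calculation. The numbers below (vocab size, document length) are illustrative assumptions of mine, not figures from Dwarkesh or Karpathy; the point is only the orders-of-magnitude gap between a per-token cross-entropy signal and a single scalar reward:

```python
import math

# Illustrative assumptions (not from the original discussion):
vocab_size = 50_000      # assumed tokenizer vocabulary
tokens_per_doc = 1_000   # assumed pretraining document length

# Upper bound on supervision from one pretraining document:
# each next-token target can carry up to log2(vocab) bits.
pretrain_bits = tokens_per_doc * math.log2(vocab_size)

# A binary pass/fail reward on an entire RL rollout is at most 1 bit,
# no matter how many tokens the rollout generated.
rl_bits = 1.0

print(f"pretraining: ~{pretrain_bits:,.0f} bits per 1,000-token document")
print(f"RL: {rl_bits:.0f} bit per whole episode")
```

Under these (generous to RL, since real reward signals are noisy) assumptions, one pretraining document supplies on the order of ten thousand times more bits of supervision than one verified RL episode.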
https://x.com/dwarkesh_sp/status/1979259041013731752/mediaVi...