Comment by hedgehog 4 days ago

The context some commenters here seem to be missing is that Marcus is arguing that spending another $100B on pure scaling (more params, more data, more compute) is unlikely to repeat the qualitatively massive improvement we saw between, say, 2017 and 2022. We see some evidence this is true in the shift towards what I'd categorize as system-integration approaches: RAG, step-by-step reasoning, function calling, "agents", etc. The theory and engineering are getting steadily better, as evidenced by the rapidly improving capability of models down in the 1-10B param range, but we don't see the same radical improvements out of ChatGPT etc.

int_19h 4 days ago

I don't see how that is evidence of the claim. We are doing all these things because they make existing models work better, but a larger model with RAG etc. is still better than a small one, and everyone keeps working on larger models.

  • hedgehog 4 days ago

    There is a contingent, which I think Marcus is responding to, that has been claiming that all we need to get to AGI or ASI is pure transformer scaling, and that we were very close, with maybe only $10B or $100B more investment needed to get there. If the last couple of years of research have given us only incrementally better models, to the point that even the best-funded teams are moving to hybrid approaches, then that's evidence that Marcus is correct.

    • klipt 4 days ago

      This website by a former OpenAI employee was arguing that a combination of hardware scaling, algorithmic improvements, etc would all combine to yield AGI in the near future: https://situational-awareness.ai/

    • KaoruAoiShiho 4 days ago

      [flagged]

      • cratermoon 4 days ago

        > AI labs are pursuing huge compute ramp-ups to scale training

        Yeah, and many, not just Marcus, are doubtful that the huge ramp-ups and scale will yield proportional gains. If you have evidence otherwise, share it.

semicolon_storm 4 days ago

Perhaps because that's a strawman argument. "Scaling" doesn't mean double the investment and get double the performance. Even OpenAI's own scaling-laws paper doesn't argue that; in its graphs, the compute axis is logarithmic. What LLM scaling means is that no wall has been found where the loss stops decreasing: increase model size/data/compute and loss will decrease -- so far.
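The power-law shape of those curves is the whole point: equal multiplicative increases in compute buy equal multiplicative (not absolute) reductions in loss, so linear gains require exponential spend. A minimal sketch of that relationship, using the rough compute fit reported in the Kaplan et al. (2020) scaling-laws paper -- the constants here are illustrative approximations, not exact values:

```python
# Kaplan-style compute scaling law: L(C) = (C_c / C) ** alpha
# The constants below are the rough fits reported for compute scaling
# (C in PF-days); treat them as illustrative, not authoritative.

def loss(compute_pf_days: float, c_c: float = 2.3e8, alpha: float = 0.050) -> float:
    """Predicted cross-entropy loss for a given compute budget."""
    return (c_c / compute_pf_days) ** alpha

# Each 10x increase in compute multiplies the loss by the same constant
# factor (10 ** -alpha), regardless of where you start -- no wall, but
# steadily more expensive to buy the next increment of quality.
for c in (1, 10, 100, 1000):
    print(f"{c:>5} PF-days -> predicted loss {loss(c):.3f}")
```

Note that nothing in this curve says the loss stops decreasing; it only says each further decrease costs an order of magnitude more compute than the last.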

edanm 4 days ago

That's important context.

But in the article, Gary Marcus does what he normally does: makes far broader statements than the narrow "LLM architecture by itself won't scale to AGI", or even "we will, or already are, reaching diminishing returns with LLMs". I don't think that's as controversial a take as he might imagine.

However, he starts from a purely technical guess, which may or may not be true, and then makes fairly sweeping statements about business and economics, which might not follow even if he's 100% right about the scaling of LLMs.

He's also seemingly extremely dismissive of the current value of LLMs. E.g. this comment, which he made previously and says he stands by:

> If enthusiasm for GenAI dwindles and market valuations plummet, AI won’t disappear, and LLMs won’t disappear; they will still have their place as tools for statistical approximation.

Is there anyone who thinks "oh gee, LLMs have a place for statistical approximation"? That's an insanely irrelevant way to describe LLMs, and given the enormous value that existing LLM systems have already created, talking about how "LLMs won't disappear, they'll still have a place" just sounds insane.

It shouldn't be hard to keep two separate thoughts in mind:

1. LLMs as they currently exist, without additional architectural changes/breakthroughs, will not, on their own, scale to AGI.

2. LLMs are already a massively useful technology that we are just starting to learn how to use and to derive business value from, and even without scaling to AGI, will become more and more prevalent.

I think those are two statements that most people should be able to agree with, probably even including most of the people Marcus is supposedly "arguing against", and yet from reading his posts it sounds like he completely dismisses point 2.

  • jpc0 4 days ago

    > 2. LLMs are already a massively useful technology that we are just starting to learn how to use and to derive business value from, and even without scaling to AGI, will become more and more prevalent.

    No offence, but while every use of AI I have tried has been amazing, I haven't been comfortable deploying it for business use. In the one or two places it is "good enough", it is effectively just reducing workforce, and that reduction isn't translating into lower costs or general uplift; it is currently translating into job losses and increased profit margins.

    I'm AI-sceptical: I feel it is a trade-off where quality of output is reduced, but it is also (currently) cheaper, so businesses are willing to jump in.

    At what point does OpenAI/Claude/Gemini etc. stop hyperscaling and start running a profit? That will translate into higher prices, so the current reduction in cost won't be there. We will be left holding the bag: higher unemployment and an inferior product that costs the same amount of money.

    There are large unanswered questions about AI, which makes me entirely anti-AI. Sure, the technology is amazing as it stands, but it is fundamentally a lossy abstraction over reality, and many people will happily accept the lossy abstraction without looking ahead to what happens when it is the only option you have and it's no cheaper than the less lossy option (humans).

    • munksbeer 2 days ago

      > The one or two places it is "good enough" it is effectively just reducing workforce and that reduction isn't translating into lower costs or general uplift, it is currently translating into job losses and increased profit margins

      What sort of examples show this?

      • jpc0 20 hours ago

        Image generation for product ads.

        And no need to tell me that's not happening; I have seen multiple examples this week of AI-generated images with a product comped in.