Comment by lukev 4 days ago

I am in full agreement that LLMs themselves seem to be beginning to level out. Their capabilities do indeed appear to be following a sigmoid curve rather than an exponential one, which is entirely unsurprising.
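The distinction between the two growth regimes can be illustrated numerically. A minimal sketch (parameter values are arbitrary, chosen only to show the shapes): exponential growth keeps compounding, while logistic (sigmoid) growth looks exponential early on but flattens toward a ceiling.

```python
import math

def exponential(t, rate=0.5):
    # Compounding growth: gains keep accelerating without bound.
    return math.exp(rate * t)

def logistic(t, L=100.0, rate=0.5, t0=10.0):
    # Sigmoid growth: approaches the ceiling L, with the steepest
    # gains near the midpoint t0 and shrinking gains after it.
    return L / (1 + math.exp(-rate * (t - t0)))

# Compare the marginal gains per time step in each regime.
for t in (0, 5, 10, 15, 20):
    print(t, round(exponential(t), 1), round(logistic(t), 1))
```

Past the midpoint, each step of the logistic curve yields less improvement than the last, which is what "leveling out" looks like even while the numbers are still going up.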

That doesn't mean there's not a lot of juice left to squeeze out of what's available now. Not just from RAG and agent systems, but also integrating neuro-symbolic techniques.

We can do this already just with prompt manipulation and integration with symbolic compute systems: I gave a talk on this at Clojure Conj just the other week (https://youtu.be/OxzUjpihIH4; apologies for the self-promotion, but I do think it's relevant).

And that's just using existing LLMs. If we start researching and training them specifically for compatibility with neuro-symbolic data (e.g., directly tokenizing and embedding ontologies and knowledge graphs), it could unlock a tremendous amount of capability.

joe_the_user 4 days ago

Even more, each earlier explosion of AI optimism involved tech that barely panned out at all. For investors, something that has yielded things of significant utility, is yielding more, and promises the potential of far more if hurdle X or Y is cleared, is a pretty appealing thing.

I respect Marcus's analysis of the technology. But a lot of AI commentators have become habituated to shouting "AI winter" every time the tech doesn't live up to its promises. Now that some substance is clearly present in AI, I can't imagine people will stop trying to get a further payoff for the foreseeable future.

  • cratermoon 4 days ago

    > For investors, something that's yielded things of significant utility

    What exactly have investors gotten in return for their investment?

    • PopePompus 4 days ago

      A product which will significantly improve the productivity of programmers, if nothing else. That may not be a good return on investment, but I think it is undeniable that recent AI advances have nonzero value for coding.

      • thefz 4 days ago

        Those who think AI will make them better programmers make me wonder what kind of day-to-day job they have. If a prompt is going to solve your problem, are you anything more than an entry-level programmer? AI will not think for you, and it's clear it is garbage against complexity.

      • la64710 4 days ago

        Then who cares? If AI gets to do all the cool things and I am left to wash dishes and do my laundry, F** AI.

YetAnotherNick 4 days ago

I tracked the Elo rating of the GPT-4/o series models on Chatbot Arena over roughly 1.5 years (they are almost always the highest rated), and at least on this metric progress not only hasn't stagnated, the growth rate actually seems to be increasing[1]

[1]: https://imgur.com/a/r5qgfQJ

  • zamadatix 4 days ago

    Something seems quite off with that metric. Why would 4o recently gain on itself at a rate ~17x faster than 4o gained on GPT-4 in that graph? Elo is a relative (competitive) metric, not an absolute one, so someone could post the same graph and claim the cause is that many new LLMs being added to the system no longer outperform previous large models the way they used to (not saying that is or isn't the case, just that the graph by itself doesn't show whether LLMs are actually advancing at different rates).

    • YetAnotherNick 3 days ago

      Chatbot Arena also publishes the head-to-head win rate for each pair of models over non-tied results[1], which helps detect global drift. E.g., the gpt-4o released on 2024-09-03 wins 69% of the time against the gpt-4o released on 2024-05-13 in blind tests.

      [1]: https://lmarena.ai/
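As a footnote on how these two numbers relate: under the standard logistic Elo model, a head-to-head win rate maps directly to a rating gap. A minimal sketch (function names are my own, not from the Chatbot Arena codebase):

```python
import math

def elo_gap_from_win_rate(p):
    """Elo rating difference implied by win probability p (ties excluded),
    inverting the logistic Elo model: p = 1 / (1 + 10 ** (-d / 400))."""
    return 400 * math.log10(p / (1 - p))

def win_rate_from_elo_gap(d):
    """Expected win probability for a d-point Elo advantage."""
    return 1 / (1 + 10 ** (-d / 400))

# The 69% head-to-head win rate cited above implies roughly a
# 140-point Elo gap between the two gpt-4o releases.
gap = elo_gap_from_win_rate(0.69)
print(round(gap))  # prints 139
```

Because the mapping is relative, the same 69% figure would hold wherever the pair sits on the leaderboard, which is why pairwise win rates are a useful check against global drift in the ratings.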