an0malous 11 hours ago

> These are not going to be problems tomorrow because the technology will shift. As has happened many times in the span of the last 2 years.

What technology shifts have happened for LLMs in the last 2 years?

dcre 11 hours ago

One example is that there used to be a whole complex apparatus around getting models to do chain-of-thought reasoning, e.g., LangChain. Now that is built in as reasoning, and models are heavily trained to do it. Same with structured outputs and tool calls: you used to have to do a bunch of work to get models to produce valid JSON in the shape you want; now it's built in and, again, models are specifically trained for it. It used to be that you had to find all the relevant context up front and hand it to the model; now agent loops can dynamically figure out what they need and make the tool calls to retrieve it. Etc., etc.
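
As a concrete illustration, here is a minimal sketch of what "built in" looks like today, assuming the OpenAI Python SDK (other providers expose similar JSON-schema options; the schema and model name here are just examples):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative schema: the API enforces that the reply is valid JSON in this shape.
schema = {
    "name": "contact",
    "strict": True,
    "schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "email": {"type": "string"},
        },
        "required": ["name", "email"],
        "additionalProperties": False,
    },
}

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # example model; any structured-output-capable model works
    messages=[{"role": "user", "content": "Extract the contact: 'Reach Ada at ada@example.com'"}],
    response_format={"type": "json_schema", "json_schema": schema},
)
print(resp.choices[0].message.content)  # valid JSON matching the schema
```

No retry loops, no regex over markdown fences: the model is trained for this and the API validates the shape.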

  • mewpmewp2 8 hours ago

    LangChain generally felt pointless to me, not a good abstraction. If anything, it kept you from the most important thing you need in this fast-evolving ecosystem: a direct, prompt-level (if you can even call that low-level) understanding of what is going on.

postalcoder 11 hours ago

If we expand this to 3 years, the single biggest shift that totally changed LLM development is the growth of context windows, from 4,000 tokens to 16,000 to 128,000 to 256,000.

When we were at 4,000 and 16,000 context windows, a lot of effort was spent on nailing down text splitting, chunking, and reduction.

For all intents and purposes, the size of current context windows obviates all of that work.
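
For readers who missed that era, the plumbing being obviated looked roughly like this (a naive sliding-window splitter sketch; real pipelines layered sentence detection, overlap tuning, and re-ranking on top):

```python
def chunk(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Naive sliding-window splitter: the sort of glue code that 4k-token
    context windows forced on everyone and that 256k windows mostly obviate."""
    step = size - overlap
    return [text[start:start + size] for start in range(0, len(text), step)]
```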

What else changed?

- Multimodal LLMs: Text extraction from PDFs was a major issue for RAG/document intelligence, and a lot of time was wasted on custom text-extraction strategies for documents. Now you can just feed the image of a PDF page into an LLM and get back a better transcription.

- Reduced emphasis on vector search: People have found that for most purposes, having an agent grep your documents is cheaper and better than a more complex RAG pipeline. Boris Cherny created a stir when he talked about Claude Code doing it that way[0]; a minimal sketch of the idea follows below.

[0] https://news.ycombinator.com/item?id=43163011#43164253
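
The grep-as-retrieval idea, sketched (this is illustrative, not Claude Code's actual implementation; a real agent would call ripgrep and search more file types):

```python
import re
from pathlib import Path

def grep_tool(pattern: str, root: str = ".", max_hits: int = 20) -> list[str]:
    """Expose plain-text search to an agent as a tool: no embeddings,
    no vector index, just a regex scan returning file:line context."""
    hits = []
    for path in Path(root).rglob("*.txt"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if re.search(pattern, line, re.IGNORECASE):
                hits.append(f"{path}:{lineno}: {line.strip()}")
                if len(hits) >= max_hits:
                    return hits
    return hits
```

The agent decides what to search for, reads the hits, and searches again, which replaces the chunk-embed-index-retrieve pipeline for many use cases.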

throwaway13337 10 hours ago

I'm amazed at this question and the responses you're getting.

These last few years, I've noticed that the tone around AI on HN changes quite a bit by waking time zone.

EU waking hours have comments that seem disconnected from genAI. And, while the US hours show a lot of resistance, it's more fear than a feeling that the tools are worthless.

It's really puzzling to me. This is the first time I've noticed such a disconnect in this community about what the reality of things is.

To answer your question personally: genAI has drastically changed the way I code about every 6 months over the last two years. Subtle capability differences change what sorts of problems I can offload, and the tasks I can trust the models with keep getting larger.

It started with better autocomplete, and now, well, agents are writing new features as I write this comment.

  • GoatInGrey 7 hours ago

    The main line of contention is how much autonomy these agents are capable of handling in a competitive environment. One side generally argues that they should be fully driven by humans (i.e., offloading tedious tasks whose exact output you already know, to save the time of doing them yourself), while the other side generally argues that AI agents should handle tasks end-to-end with minimal oversight.

    Both sides have valid observations in their experiences and circumstances. And perhaps this is simply another engineering "it depends" phenomenon.

  • bdangubic 9 hours ago

    The disconnect is quite simple: there are professionals who are willing to put in the time to learn, and then there's the vast majority who won't, and who will bitch and moan about how it's shit, etc. If you can't get these tools to make your job easier and more productive, you ought to be looking for a different career…

    • overfeed 8 hours ago

      You're not doing yourself any favors by labeling people who disagree with you as undereducated or uninformed. There are enough over-hyped products/techniques/models/magical thinking to warrant skepticism. At the root of this thread is an argument (paraphrasing) encouraging people to just wait until someone else solves the major problems instead of tackling them themselves. That is a broad statement of faith, if I've ever seen one, in a very religious sense: "Worry not, the researchers and foundation models will provide."

      My skepticism, and my intuition that AI innovation is not exponential but sigmoid, are not because I don't understand what gradient descent, transformers, RAG, CoT, or multi-head attention are. My statement of faith is: the ROI economics will catch up with the exuberance well before AGI/ASI is achieved. Sure, you're getting improving agents for now, but that's not going to justify the 12- or 13-digit USD investments. The music will stop, and improvements will slow to a drip.

      Edit: I think at its root, the argument is between folks who think AI will follow the same curve as past technological trends, and those who believe "it's different this time".

      • bdangubic 7 hours ago

        > labeling people who disagree with you undereducated or uninformed

        I did neither of those two things... :) I personally couldn't care less about:

        - (over)hype

        - 12/13/14/15 ... digit USD investment

        - exponential vs. sigmoid

        There are basically two groups of industry folk:

        1. those that see technology as absolutely transformational and are already doing amazeballs shit with it

        2. those that argue how it is bad/not-exponential/ROI/...

        If I were a professional (I am), I would do everything in my power to learn everything there is to learn (and then some) and join Group #1. But it is easier to be in Group #2, because being in Group #1 requires time and effort and frustration and throwing your laptop out the window and ... :)

      • juped 5 hours ago

        They're not logistic. This is a species of nonsense claim that irks me even more than "capability gains are exponential, singularity 2026!"; it actually includes the exponential-gains claim and then tacks on epicycles to preempt the absence of any singularity.

        Remember, a logistic curve is an exponential (so, roughly, a process whose outputs feed its growth, the classic example being population growth, where more population makes more population) with a carrying capacity (the classic example is again population, where you need to eat to be able to reproduce).
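
        In symbols (the standard logistic ODE, stated here just for reference):

        ```latex
        \frac{dP}{dt} = rP\left(1 - \frac{P}{K}\right)
        ```

        For P much smaller than the carrying capacity K, the bracket is approximately 1 and the curve is indistinguishable from a pure exponential; growth only stalls as P approaches K. That is exactly the sense in which the sigmoid claim contains the exponential one.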

        Singularity 2026 is open and honest, wearing its heart on its sleeve. It's a much more respectable wrong position.

    • siva7 9 hours ago

      It's disheartening. I have a colleague, very senior, who dislikes AI for a myriad of reasons and doesn't want to adapt unless forced by management. I feel that from 2022-2024 the majority of my colleagues were in this camp: either afraid of AI, or regarding it as nothing a "real" developer would ever use. In 2025 that seemed to change a bit. American HN seemed to adapt more quickly, while EU companies still lack the foresight to see what is happening on the grand scale.

  • GiorgioG 10 hours ago

    Despite the latest and greatest models, I still see glaring logic errors in the code produced for anything beyond basic CRUD apps. They still make up fields that don't exist and assign nonsensical values to variables. I'll give you an example: in the code in question, Codex assigned the required field LoanAmount a value from a variable called assessedFeeAmount, simply because, as far as I can tell, it had no idea how to get the correct value from the current function/class.

    • lbreakjai 8 hours ago

      That's why I don't get people who claim to let an agent run for an hour on some task. LLMs make so many small errors like that, and they are so hard to catch if you aren't super careful.

      I wouldn't want to have to review the output of an agent going wild for an hour.

      • snoman 5 hours ago

        Who says anyone’s reviewing anything? I’m seeing more and more influencers and YouTubers playing engineer or just buying an app from an overseas app farm. Do you think anyone in that chain gives the first shit what the code is like?

        It’s the worst kind of disposable software.

  • nickphx 10 hours ago

    AI is useless. Anyone claiming otherwise is dishonest.

    • la_fayette 8 hours ago

      I use GenAI for text translation, text-to-speech, and speech-to-text; there it is extremely useful. For coding I often have the feeling it is useless, but sometimes it is useful, like most tools...

    • whattheheckheck 10 hours ago

      What are you doing at your job that AI can't help with at all, such that you consider it completely useless?

    • ghurtado 10 hours ago

      That could even be argued (with an honest interlocutor, which you clearly are not)

      The usefulness of your comment, on the other hand, is beyond any discussion.

      "Anyone who disagrees with me is dishonest" is some kindergarten level logic.

    • ulfw 9 hours ago

      [Deleted as Hackernews is not for discussion of divergent opinions]

      • wiseowise 9 hours ago

        > It's not useless but it's not good for humanity as a whole.

        Ridiculous statement. Is Google also not good for humanity as a whole? Is the Internet? Wikipedia?

  • the_mitsuhiko 10 hours ago

    > EU waking hours have comments that seem disconnected from genAI. And, while the US hours show a lot of resistance, it's more fear than a feeling that the tools are worthless.

    I don't think it's because the audience is different, but because the moderators are asleep when Europeans are up. There are certain topics that don't really survive on the front page when the moderators are active.

    • jagged-chisel 10 hours ago

      I'm unsure how you're using "moderators." We, the audience, are all 'moderators' if we have the karma. The operators of the site are pretty hands-off as far as content in general.

      This would mean it is because the audience is different.

      • the_mitsuhiko 9 hours ago

        I’m referring to the actual moderators of this website removing posts from the front page.

      • uoaei 10 hours ago

        The people who "operate" the website are different from the people who "moderate" the website but both are paid positions.

        This frou-frou about how "we all play a part" only serves to obscure that reality.

    • jamesblonde 7 hours ago

      Anything on sovereign AI or the like is gone immediately when the mods wake up. Got an EU cloud article? Publish it at 11am CET; it disappears around 12:30.

deepdarkforest 10 hours ago

On the foundation-model level: test-time compute (reasoning), heavy RL post-training, 1M+ context lengths, etc.

On the application layer, connecting LLMs to sandboxes/VMs is one of the biggest shifts (Cloudflare's Code Mode, etc.). Giving an LLM a sandbox unlocks on-the-fly computation, calculations, RPA, anything really.
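
A toy illustration of the sandbox pattern (not how Cloudflare's Code Mode actually works; a real sandbox adds an isolated VM or container, no network, and resource caps):

```python
import subprocess, sys, tempfile

def run_llm_code(llm_code: str, timeout: int = 5) -> str:
    """Write model-generated code to a temp file and run it in a separate
    interpreter with a hard timeout, returning output for the agent loop."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(llm_code)
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, path], capture_output=True, text=True, timeout=timeout
        )
        return result.stdout or result.stderr
    except subprocess.TimeoutExpired:
        return "error: execution timed out"

# The agent loop feeds the output back to the model as a tool result:
print(run_llm_code("print(sum(range(10)))"))  # -> 45
```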

MCP, or rather standardized function calling, is another one.

Also, local LLMs are becoming almost viable, thanks to better and better distillation plus leaning on quick web search for facts, etc.

WA 11 hours ago

Not the LLMs. The APIs got more capabilities, such as tool/function calling, explicit caching, etc.
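
For reference, the API-level tool calling being described looks roughly like this, assuming the OpenAI Python SDK (the get_weather tool is hypothetical, standing in for any real function):

```python
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, for illustration only
        "description": "Get the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # example model
    messages=[{"role": "user", "content": "How warm is Lisbon right now?"}],
    tools=tools,
)

# A tool-trained model typically responds with a structured call rather than
# prose; the application runs the function and sends the result back.
call = resp.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```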

  • dcre 11 hours ago

    It is the LLMs, because they have to be RL-trained to be good at these things.

echelon 11 hours ago

We started putting them in image and video models and now image and video models are insane.

I think the next period of high and rapid growth will be in media (image, video, sound, 3D), not text.

It's much harder to adapt LLMs to business use cases in text. Each problem is niche, you have to custom-tailor the solution, and the tooling is crude.

The media use cases, by contrast, are low-hanging fruit and result in 10,000x speedups and cost reductions almost immediately. The models are pure magic.

I think more companies would be wise to ignore text for now and focus on visual domain problems.

Nano Banana has so much more utility than agents. And there are so many low-hanging-fruit ways to make lots of money.

Don't sleep on image and video. That's where the growth salient is.

  • wild_egg 10 hours ago

    > Nano Banana has so much more utility than agents.

    I am so far removed from multimedia spaces that I truly can't imagine a universe where this could be true. Agents have done incredible things for me and Nano Banana has been a cool gimmick for making memes.

    Anyone have a use case for media models that'll expand my mind here?

    • echelon 10 hours ago

      We now have capacity to program and automate in the optics, signals, and spatial domains.

      As someone in the film space, here's just one example: we are getting extremely close to being able to make films with only AI tools.

      Nano Banana makes it easy to create character- and location-consistent shots that adhere to film language and the rules of storytelling. This still isn't "one-shot", and considerable effort still needs to be put in by humans, not unlike AI assistance in IDEs requiring a human engineer at the controls.

      We're entering the era of two person film studios. You'll undoubtedly start seeing AI short films next year. I had one art school professor tell me that film seems like it's turning into animation, and that "photorealism" is just style transfer or an aesthetic choice.

      The film space is hardly the only space where these models have utility. There are so many domains. News, shopping, gaming, social media, phone and teleconference, music, game NPCs, GIS, design, marketing, sales, pitching, fashion, sports, all of entertainment, consumer, CAD, navigation, industrial design, even crazy stuff like VTubing, improv, and LARPing. So much of what we do as humans is non-text based. We haven't had effective automation for any of this until this point.

      This is a huge percentage of the economy. This is actually the beating heart of it all.

      • wild_egg 2 hours ago

        Been thinking about this. Curious why you positioned it as Nano Banana having more utility than agents, when it seems like the next level would be Nano Banana with agents?

        The two are kind of orthogonal concepts.

      • yunwal 10 hours ago

        > we are getting extremely close to being able to make films with only AI tools

        AI still can't reliably render text in background details. It can't get shadows right. If you ask it to shoot something from a head-on perspective, for example a bookshelf, it fails to keep the proportions accurate: the shelves won't be parallel, the books won't have text, and in a library the labels won't be in Dewey decimal order.

        It still lacks a huge amount of understanding about how the world works necessary to make a film. It has its uses, but pretending like it can make a whole movie is laughable.