grayhatter 2 days ago

> I suspect you’re misjudging the friend here. This sounds more like the famous “no brown m&ms” clause in the Van Halen performance contract. As ridiculous as the request is, it being followed provides strong evidence that the rest (and more meaningful) of the requests are.

I'd argue it's more like you've bought so deeply into the idea that this is reasonable that you're also willing to go to extreme lengths to retcon things and pretend this is sane.

Imagine two different worlds: one where the tools engineers use have a clear and reasonable way to detect and determine whether the generative subsystem is still on the rails provided by the controller.

And another world where the system is completely devoid of any basic introspection interface, and, because it's a problematic mess all the way down, everyone invents some asinine ritual they believe provides some sort of signal as to whether or not the random noise generator has gone off the rails.

> Sounds like the friend understands quite well how LLMs actually work and has found a clever way to be signaled when it’s starting to go off the rails.

My point is that while it's a cute hack, if you step back and compare it objectively to what good engineering would look like, it's wild that so many people are willing to accept this interface as "functional" because it means they don't have to do the thinking required to produce the output the AI emits via its specific randomness function.

Imagine these two worlds actually do exist; instead of using the real interface that provides a clear bool answer to "has the generative system gone off the rails?", people *want* to be called Mr Tinkleberry.

Which world do you think this example lives in? You could convince me Mr Tinkleberry is a cute example of the latter, obviously... but it'd take effort to convince me that this reality is half reasonable, or that people who want to call themselves engineers should feel proud to be a part of this one.

Before you try to strawman my argument: this isn't a gatekeeping argument. It's only a critical take on the interface options we have for understanding something that might as well be magic, because that opacity serves the snake-oil sales much better.

> > Is the magic token machine working?

> Fuck I have no idea dude, ask it to call you a funny name, if it forgets the funny name it's probably broken, and you need to reset it

Yes, I enjoy working with these people and living in this world.
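
To make the contrast concrete, here's a minimal sketch of the two worlds in Python. Both interfaces are invented purely for illustration; no real tool I know of exposes either one:

    # Purely illustrative; neither interface exists in any real tool.

    # World one: the tool exposes genuine introspection, and the
    # controller can report whether the generative subsystem is still
    # within its constraints.
    def generator_is_on_rails(session) -> bool:
        return session.controller.within_constraints()

    # World two: the canary-name heuristic. Plant a silly name in the
    # system prompt; if a reply stops using it, assume the context has
    # degraded and reset the session.
    CANARY = "Mr Tinkleberry"

    def generator_seems_on_rails(reply: str) -> bool:
        return CANARY in reply

One returns a bool you can trust; the other returns a vibe.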

gyomu 2 days ago

It is kind of wild that not that long ago, the general sentiment in software engineering (at least as observed on boards like this one) seemed to be about valuing systems that were understandable and introspectable, with tight feedback loops, within which we could compose layers of abstraction in meaningful and predictable ways (see, for example, the hugely popular - at the time - works of Chris Granger, Bret Victor, etc).

And now we've made a complete 180, and people are getting excited about proprietary black boxes and "vibe engineering," where you have to pretend the computer is some amnesiac, schizophrenic being that you have to coerce into maybe doing your work for you, but you're never really sure whether it's working or not, because who wants to read 8,000-line code diffs every time you ask it to change something. And never mind if your feedback loops are multiple minutes long because you're waiting on some agent to execute some complex network+GPU-bound workflow.

  • adastra22 2 days ago

    You don’t think people are trying very hard to understand LLMs? We recognize the value of interpretability. It is just not an easy task.

    It’s not the first time in human history that our ability to create things has exceeded our capacity to understand.

    • grayhatter 2 days ago

      > You don’t think people are trying very hard to understand LLMs? We recognize the value of interpretability. It is just not an easy task.

      I think you're arguing against a position tangential to both me and the person this directly replies to. It can be hard to use and understand something, but if you have a magic box and you can't tell whether it's working, it doesn't belong anywhere near the systems that other humans use. The people who use the code you're about to commit to whatever repo you're generating code for all deserve better than to be part of your unethical science experiment.

      > It’s not the first time in human history that our ability to create things has exceeded our capacity to understand.

      I don't agree that this is a correct interpretation of the current state of generative transformer-based AI. But even if you wanted to try to convince me, my point would still be that this belongs in a research lab, not anywhere near prod. And that wouldn't be a controversial idea in the industry.

      • adastra22 2 days ago

        We used the steam engine for 100 years before we had a firm understanding of why it worked. We still don’t understand how ice skating works. We don’t have a physical understanding of semi-fluid flow in grain silos, but we’ve been using them since prehistory.

        I could go on and on. The world around you is full of not-well-understood technology, as well as non-deterministic processes. We know how to engineer around that.

      • nineteen999 2 days ago

        > It doesn't belong anywhere near the systems that other humans use

        Really, for those of us who actually work on critical systems (emergency services in my case) - of course we're not going to start patching the core applications with vibe code.

        But yeah, that Frankenstein reporting script that half a dozen amateur hackers made a mess of over 20 years instead of refactoring and redesigning? That's prime fodder for this stuff. NOBODY wants to clean that up by hand.

    • gyomu 2 days ago

      Your comment would be more useful if you could point us to some concrete tooling built in the ~3 years that LLM-assisted coding has been around to improve interpretability.

      • adastra22 2 days ago

        That would be the exact opposite of my claim: it is a very hard problem.

orbital-decay 2 days ago

This reads like you either have an idealized view of Real Engineering™ or used to work in a stable, extremely regulated area (e.g. civil engineering). I used to work in aerospace, and we had a lot of silly Mr Tinkleberry canaries. We didn't strictly rely on them, because our job was "extremely regulated" to put it mildly, but they did save us some time.

There's a ton of pretty stable engineering subfields that involve a lot more intuition than rigor. A lot of things in EE are like that. Anything novel as well. That's how steam in the 19th century or aeronautics in the early 20th century felt. Or rocketry in the 1950s, for that matter. There's no need to be upset with the fact that some people want to hack explosive stuff together before it becomes a predictable glacier of Real Engineering.

  • grayhatter 2 days ago

    > There's no need to be upset with the fact that some people want to hack explosive stuff together before it becomes a predictable glacier of Real Engineering.

    You misunderstand me. I'm not upset that people are playing with explosives. I'm upset that my industry is playing with explosives that all read, "front: face towards users"

    And then, more upset that we're all seemingly ok with that.

    The driving force of the enshittification of everything may be external, but the degradation clearly comes from engineers first. These broader industry trends only convince me it's not likely to get better anytime soon, and I don't like how everything is user-hostile.

  • gyomu 2 days ago

    Man, I hate this kind of HN comment that makes grand sweeping statements like "that's how it was with steam in the 19th century or rocketry in the 1950s," because there's no way to tell whether you're just pulling these things out of your... to get internet points or actually have insightful parallels to make.

    Could you please elaborate with concrete examples on how aeronautics in the 20th century felt like having a fictional friend in a text file for the token predictor?

    • orbital-decay 2 days ago

      We're not going to advance the discussion this way. I also hate this kind of HN comment that makes grand sweeping statements like "LLMs are like having a fictional friend in a text file for the token predictor," because there's no way to tell whether you're just pulling these things out of your... to get internet points or actually have insightful parallels to make.

      Yes, during the Wright era aeronautics was absolutely dominated by tinkering, before aerodynamics was figured out. It wouldn't pass the high standard of Real Engineering.

      • grayhatter 2 days ago

        > Yes, during the Wright era aeronautics was absolutely dominated by tinkering, before aerodynamics was figured out. It wouldn't pass the high standard of Real Engineering.

        Remind me: did the Wright brothers start selling tickets to individuals telling them it was completely safe? Was step 2 of their research building a large passenger plane?

        I originally wanted to avoid that specific flight analogy because it felt a bit too reductive. But while we're being reductive, how about medicine too: the first smallpox vaccine was absolutely not well understood... would that origin story pass ethical review today? What do you think the practical consequences would be if the medical profession encouraged that specific kind of behavior?

        > It wouldn't pass the high standard of Real Engineering.

        I disagree; I think it 100% is real engineering. Engineering at its most basic is tricking physics into doing what you want, and there's no more perfect example of that than heavier-than-air flight. But there's a critical difference between engineering research and experimenting on unwitting people. I don't think users need to know how the sausage is made; that applies equally to planes, bridges, medicine, and code. But the professionals absolutely must. It's disappointing watching the industry I'm a part of willingly eschew understanding to avoid a bit of effort. Such a thing is considered malpractice in "real professions".

        Ideally, neither of you would wring your hands about the flavor or form of the argument, or poke fun at the gamified comment thread. But if you're gonna complain about others not adding positively to the discussion, try to add something to it along with the complaints?

solumunus 2 days ago

I use agents almost all day, and I do way more thinking than I used to; this is why I’m now more productive. There is little thinking required to produce output: typing requires very little thinking. The thinking is all in the planning… If the LLM output is bad in any given file, I simply step in and modify it, and obviously this is much faster than typing every character.

I’m spending more time planning, and my planning is more comprehensive than it used to be. I’m spending less time producing output, yet my output is more plentiful and of equal quality. No generated code goes into my commits without me reviewing it. Where exactly is the problem here?

adastra22 2 days ago

It feels like you’re blaming the AI engineers here, as if they built it this way out of ignorance or something. Look into interpretability research. It is a hard problem!

  • grayhatter 2 days ago

    I am blaming the developers who use AI because they're willing to sacrifice intellectual control in exchange for something that I find has minimal value.

    I agree it's likely to be a complex or intractable problem. But I don't enjoy watching my industry slide down the professionalism scale. Professionals don't choose tools whose workings they can't explain. If your solution for knowing whether your tool is still functional is inventing an amusing name and using that as the heuristic, because you have no better way to determine if it's still working correctly, that feels like it might be a problem, no?

    • adastra22 2 days ago

      I’m sorry you don’t like it. But this has very strong old-man-yells-at-cloud vibes. This train is moving, whether you want it to or not.

      Professionals use tools that work, whether they know why it works is of little consequence. It took 100 years to explain the steam engine. That didn’t stop us from making factories and railroads.

      • grayhatter 2 days ago

        > It took 100 years to explain the steam engine. That didn’t stop us from making factories and railroads.

        You keep saying this. Why do you believe it so strongly? Because I don't believe it's true. So why do you?

        And even assuming it's completely true exactly as stated, shouldn't we have higher standards than that when dealing with things that people interact with? Boiler explosions are bad, right? Shouldn't we do everything we can to prove stuff works the way we want and expect? Do you think AI, as it's currently commonly used, helps do that?

pacifika 2 days ago

This could be a very niche standup comedy routine; I approve.