smaudet 20 hours ago

I guess my challenge is this: if it was "a rote recitation of an idiomatic go function", was it worth writing at all?

There is a certain style of programming, let's say, that encourages highly non-reusable code: code that is at once boring and tedious, impossible to maintain, and thus not especially worthwhile.

The "rote code" could probably have been expressed, succinctly, in terms that border on "plain text", but with more rigueur de jour, with less overpriced, wasteful, potentially dangerous models in-between.

And yes, machines like the eBPF verifier must follow strict rules to cut out the chaff, of which there is quite a lot. But it neither follows that we should write everything in eBPF, nor that because something can throw out the proverbial "garbage", it makes a good model to follow...

Put another way, if it was that rote, you likely neither needed nor benefited from the AI to begin with; a couple of well-tested library calls probably would have sufficed.
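
To make that concrete, a toy sketch of my own (nothing from the article) - the kind of rote Go an LLM will happily recite, next to the well-tested stdlib call that makes it unnecessary:

    package demo

    import "slices"

    // containsRote is the kind of boilerplate an LLM will gladly recite.
    func containsRote(xs []string, target string) bool {
        for _, x := range xs {
            if x == target {
                return true
            }
        }
        return false
    }

    // containsLib replaces all of it with one well-tested stdlib call (Go 1.21+).
    func containsLib(xs []string, target string) bool {
        return slices.Contains(xs, target)
    }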

sesm 17 hours ago

I would put it differently: when you already have a mental model of what the code is supposed to do and how, reviewing is easy; you just check that the code conforms to that model.

With an arbitrary PR from a colleague or a security audit, you have to build that mental model first, and that's the hardest part.

tptacek 20 hours ago

Yes. More things should be rote recitations. Rote code is easy to follow and maintain. We get in trouble trying to be clever (or DRY) --- especially when we do it too early.

Important tangential note: the eBPF verifier doesn't "cut out the chaff". It rejects good, valid programs. It does not care that the programs are valid or good; it cares that it is not smart enough to understand them; that's all that matters. That's the point I'm making about reviewing LLM code: you are not on the hook for making it work. If it looks even faintly off, you can't hurt the LLM's feelings by killing it.
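
To make the clever/rote tradeoff concrete, a toy Go example of my own (not anything from the article): the clever version saves lines; the rote version is the one you can skim.

    package totals

    // Fold is the "clever" DRY route: every caller now has to decode a
    // higher-order function in their head.
    func Fold[T, A any](xs []T, acc A, f func(A, T) A) A {
        for _, x := range xs {
            acc = f(acc, x)
        }
        return acc
    }

    func TotalClever(prices []int) int {
        return Fold(prices, 0, func(a, p int) int { return a + p })
    }

    // TotalRote is boring and repetitive - and trivially reviewable.
    func TotalRote(prices []int) int {
        total := 0
        for _, p := range prices {
            total += p
        }
        return total
    }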

  • smaudet 20 hours ago

    > We get in trouble trying to be clever (or DRY)

    Certainly, however:

    > That's the point I'm making about reviewing LLM code: you are not on the hook for making it work

    The second portion of your statement is either confusing (something unsaid) or untrue (you are still ultimately on the hook).

    Agentic AI is just yet another way, as you put it, to "get in trouble trying to be clever".

    My previous point stands - if it was that cut and dried, then a (free) script/library could generate the same code. If your only real use of AI is to replace template systems, congratulations on perpetuating the most over-engineered template system ever. I'll stick with a provable, free template system, or just not write the code at all.
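
    To show what I mean by a free, provable template system, here's a minimal sketch using Go's stdlib text/template; the endpoint names and the getJSON helper it emits are made up:

        package main

        import (
            "os"
            "text/template"
        )

        // getterTmpl stamps out one rote client getter per endpoint.
        var getterTmpl = template.Must(template.New("getter").Parse(
            `// Get{{.Name}} fetches {{.Path}}.
        func (c *Client) Get{{.Name}}() (*{{.Type}}, error) {
            var out {{.Type}}
            if err := c.getJSON("{{.Path}}", &out); err != nil {
                return nil, err
            }
            return &out, nil
        }
        `))

        func main() {
            // The "spec": a table of endpoints, which is all the input
            // this kind of rote code actually needs.
            endpoints := []struct{ Name, Type, Path string }{
                {"User", "User", "/api/user"},
                {"Orders", "OrderList", "/api/orders"},
            }
            for _, e := range endpoints {
                if err := getterTmpl.Execute(os.Stdout, e); err != nil {
                    panic(err)
                }
            }
        }

    Run it and you get one identical getter per endpoint, and the template itself is the entire "model" you have to audit.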

    • vidarh 17 hours ago

      > The second portion of your statement is either confusing (something unsaid) or untrue (you are still ultimately on the hook).

      You're missing the point.

      tptacek is saying he isn't the one who needs to fix the issue because he can just reject the PR and either have the AI agent refine it or start over. Or ultimately resort to writing the code himself.

      He doesn't need to make the AI-written code work, so he doesn't need to spend a lot of time reading it - he can skim it for any sign that it looks even faintly off and, if it does, just kill it instead of spending more time on it.

      > My previous point stands - if it was that cut and dry, then a (free) script/library could generate the same code.

      There's a vast chasm between "simple enough that a non-AI code generator can produce it from templates" and "simple enough that a fast read-through shows it's okay to run".

      As an example, the other day I had my own agent generate a 1kloc API client for an API. The worst-case scenario, other than it failing to work, would be it doing something really stupid, like deleting all my files. Since it passes its tests, skimming it was enough to give me confidence that nowhere does it do any file manipulation other than reading the files passed in. For that use, that's sufficient: it otherwise passes the tests, and I'll be the only user for some time while the server it's a client for is under development.

      But no template-based generator could write that code, even though it's fairly trivial - it involved reading the backend API implementation and rote-implementing a client to match the server.
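
      To give a sense of the shape of that code, an illustrative sketch with made-up names (not the actual client) - every endpoint wrapper is the same marshal/POST/check/decode dance, and "touches no files" is exactly the property a skim confirms:

          package client

          import (
              "bytes"
              "encoding/json"
              "fmt"
              "net/http"
          )

          type Widget struct {
              ID   int    `json:"id"`
              Name string `json:"name"`
          }

          type Client struct {
              BaseURL string
              HTTP    *http.Client
          }

          // CreateWidget is one of dozens of near-identical wrappers:
          // marshal, POST, check status, decode. No filesystem access.
          func (c *Client) CreateWidget(w Widget) (*Widget, error) {
              body, err := json.Marshal(w)
              if err != nil {
                  return nil, fmt.Errorf("marshal widget: %w", err)
              }
              resp, err := c.HTTP.Post(c.BaseURL+"/widgets", "application/json", bytes.NewReader(body))
              if err != nil {
                  return nil, fmt.Errorf("post /widgets: %w", err)
              }
              defer resp.Body.Close()
              if resp.StatusCode != http.StatusCreated {
                  return nil, fmt.Errorf("post /widgets: unexpected status %s", resp.Status)
              }
              var created Widget
              if err := json.NewDecoder(resp.Body).Decode(&created); err != nil {
                  return nil, fmt.Errorf("decode response: %w", err)
              }
              return &created, nil
          }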

      • smaudet 17 hours ago

        > But no template based generator could write that code, even though it's fairly trivial

        Not true at all; in fact, this sort of thing happened all the time 10 years ago - code that read API definitions and generated clients...

        > He doesn't need to make the AI written code work, and so he doesn't need to spend a lot of time reading the AI written code - he can skim it for any sign it looks even faintly off and just kill it if that's the case instead of spending more time on it.

        I think you are missing the point as well: that's still review, and that's still being on the hook.

        Words like "skim" and "kill" are the problem here, not a solution. They point to a broken process that looks like its working...until it doesn't.

        But I hear you say "all software works like that". Well, yes, to some degree. The difference is that one of them you hopefully actually wrote, so you have some idea of what's going wrong. The other one?

        Well, you just have to sort of hope it works, and when it doesn't - well, you said it yourself: the code was garbage anyway, time to "kill" it and generate some new slop...