Comment by jstummbillig 5 days ago

I am guessing: Maybe you are not used to or comfortable with delegating work?

You will certainly understand a program better where you write every line of code yourself, but that limits your output. It's a trade-off.

The part that makes it work quite well is that you can also use the LLM to better understand the code where required, simply by asking.

brailsafe 5 days ago

> I am guessing: Maybe you are not used to or comfortable with delegating work?

The difference between delegating to a human vs an LLM is that a human is liable for understanding it, regardless of how it got there. Delegating to an LLM means you're just more rapidly creating liabilities for yourself, which indeed is a worthwhile tradeoff depending on the complexity of what you're losing intimate knowledge of.

  • jstummbillig 5 days ago

    The topic of liability is a difference but I think not an important one, if your objective is to get things done. In fact, humans being liable creates high incentives to obscure the truth, deceive, or move slowly to limit personal risk exposure, all of which are very real world hindrances.

    In the end the person in charge is liable either way, in different ways.

    • brailsafe 5 days ago

      > all of which are very real world hindrances.

      Real-world responsibilities to manage, which can sometimes be hindrances at certain levels, but no functional society lets people do arbitrary things at any speed, regardless of the impact on others, in the name of a checklist. What I mean is: if I ask a person on my team whom I trust to do something, they'll use a machine to do it, but if it's wrong, they're responsible for fixing it and for maintaining the knowledge of how to fix it. If a bridge fails, it's on the Professional Engineer who has signoff on the project, as well as the others doing the engineering work, to make sure they make a bridge that doesn't collapse. If software engineers can remotely call themselves that without laughing, they need to consider their liability along the way, depending on the circumstances.

  • Flere-Imsaho 5 days ago

    As a technical manager, I'm liable for every line of code we produce - regardless of who in the team actually wrote the code. This is why I review every pull request :)

    • dynamite-ready 5 days ago

      This is interesting. At what level and team size? There's going to have to be a point where you just give in to the 'vibes' (whether it's from a human, or a machine), otherwise you become the bottleneck, no?

      • 9cb14c1ec0 5 days ago

        Better a bottleneck than constant downtime.

      • Flere-Imsaho 5 days ago

        Only 4 or so people...so small, but that's how agile teams should be.

        • brailsafe 5 days ago

          I think there's a place for this, and it's not rare for one person to be the PR bottleneck like this, but I don't think it would be for me in either position; people should be able to be responsible for reviewing each other's work imo. Incidentally, "Agile" with a capital A sucks and should die in a fire, but lowercase-a "agile" probably does, by necessity, mean smaller teams.

tclancy 5 days ago

There is probably a case of both people being right here, just having arrived at, or found, different end results. For me, Claude has been a boon for prototyping stuff I always wanted to build but didn't want to do the repetitive plumbing slog to get started. But I have found you hit a level of complexity where AIs bog down and start telling you they have fixed the bug you have just asked about for the sixth time, without doing anything or bothering to check.

Maybe that’s just the level I gave up at and it’s a matter of reworking the Claude.md file and other documentation into smaller pieces and focusing the agent on just little things to get past it.

imron 5 days ago

I’m perfectly comfortable and used to delegating, but delegation requires trust that the result will be fit for purpose.

It doesn’t have to be exactly how I would do it but at a minimum it has to work correctly and have acceptable performance for the task at hand.

This doesn't mean being super optimized, just that it shouldn't be doing stupid things like n+1 requests or database queries, etc.

See a sibling comment for one example on correctness. Another, related to performance, involved querying some information from a couple of database tables (the first with 50,000 rows, the second with 2.5 million).

After specifying things in enough detail to let the AI go, it got correct results, but performance was rather slow. A bit more back and forth and it got up to processing 4,000 rows a second.

It was so impressed with its new performance it started adding rocket ship emojis to the output summary.

There were still some obvious (to me) performance issues, so I pressed it to see if it could improve the performance. It started suggesting some database config tweaks, which provided marginal improvements, but it was still missing some big wins elsewhere: namely, it was avoiding "expensive" joins and doing that work in the app instead, resulting in n+1 DB calls.

So I suggested getting the DB to do the join and just processing the fully joined data on the app side. This doubled throughput (8,000 rows/second) and led to claims from the AI that this was now enterprise-ready code.
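For what it's worth, the two shapes being compared here can be sketched like this (Python with an in-memory SQLite database; the `parent`/`child` tables are made up for illustration, not the actual schema from the story):

```python
import sqlite3

# Hypothetical parent/child tables standing in for the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE child  (id INTEGER PRIMARY KEY, parent_id INTEGER, value INTEGER);
    INSERT INTO parent VALUES (1, 'a'), (2, 'b');
    INSERT INTO child  VALUES (1, 1, 10), (2, 1, 20), (3, 2, 30);
""")

def n_plus_one():
    # One query for the parents, then one query per parent row:
    # N+1 round trips to the database.
    rows = []
    for pid, name in conn.execute("SELECT id, name FROM parent ORDER BY id"):
        for (value,) in conn.execute(
                "SELECT value FROM child WHERE parent_id = ? ORDER BY id",
                (pid,)):
            rows.append((name, value))
    return rows

def single_join():
    # One round trip: the database does the join.
    return list(conn.execute("""
        SELECT p.name, c.value
        FROM parent p JOIN child c ON c.parent_id = p.id
        ORDER BY p.id, c.id
    """))

assert n_plus_one() == single_join() == [("a", 10), ("a", 20), ("b", 30)]
```

Same results either way; the difference is the number of round trips, which is what dominates once the tables get large.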

There was still low-hanging fruit, though, because it was calling the DB and getting all results back before processing anything.

After suggesting switching to streaming results (good point!) we got up to 10,000 rows/second.
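The streaming change can be sketched similarly (again Python/SQLite as a stand-in; with a client/server database the equivalent idea is a server-side cursor, e.g. a named cursor in psycopg2):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v INTEGER)")
conn.executemany("INSERT INTO t (v) VALUES (?)",
                 [(i,) for i in range(100_000)])

def fetch_all_then_process():
    # Materialize the whole result set before touching a single row:
    # peak memory grows with row count, and no processing overlaps fetching.
    rows = conn.execute("SELECT v FROM t").fetchall()
    return sum(v for (v,) in rows)

def stream_and_process():
    # Iterate the cursor directly: rows are consumed as they arrive,
    # so processing overlaps fetching and memory stays flat.
    total = 0
    for (v,) in conn.execute("SELECT v FROM t"):
        total += v
    return total

assert fetch_all_then_process() == stream_and_process() == 4_999_950_000
```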

This was acceptable performance, but after a bit more wrangling we got things up to 11,000 rows/second and now it wasn’t worth spending much extra time squeezing out more performance.

In the end the AI came to a good result, but at each step of the way it was me hinting it in the correct direction, and then the AI congratulating me on the incredible "world class performance" (an actual quote, but difficult to believe when you then double performance again).

If it had just been me, I would have finished it in half the time.

If I'd delegated to a less senior employee and we'd gone back and forth a bit, pairing to get it to this state, it might have taken the same amount of time and effort, but they would've at least learned something.

Not so with the AI, however: it learns nothing, and I have to make sure I re-explain things and concepts all over again the next time, in sufficient detail that it will do a reasonable job (not expecting perfection, it just needs to be acceptable).

And so my experience so far (much more than just these 2 examples) is that I can’t trust the AI to the point where I can delegate enough that I don’t spend more time supervising/correcting it than I would spend writing things myself.

Edit: using AI to explain existing code is a useful thing it can do well. My experience is it is much better at explaining code than producing it.

  • amitav1 5 days ago

    Not trying to downplay your grievances, but isn't this what [Skills](https://claude.com/skills) are for? After going back and forth on something like that, create a skill that's something along the lines of

    `database-query-speed-optimization`: "Some rules of thumb for using database queries:

    - Use joins
    - Streaming results is faster
    - etc."

    That way, the next time you have to do something like this, you can remind it of / it will find the skill.
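    A skill like this is typically a folder containing a `SKILL.md` with YAML frontmatter plus instructions; a minimal sketch (field names as I understand the Skills format, so treat the exact schema as an assumption):

    ```markdown
    ---
    name: database-query-speed-optimization
    description: Rules of thumb for fast database access. Use when writing or reviewing code that queries a database.
    ---

    - Prefer one DB-side join over per-row queries (avoid n+1).
    - Stream large result sets instead of fetching everything up front.
    - Measure before and after each change; these rules have exceptions.
    ```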

    • imron 5 days ago

      Yeah, it is, but firstly this example was from before Skills were a thing, and secondly the rules might not be universally applicable.

      In this case the two tables shared a 1:1 mapping of primary key to foreign key, so the join was fast and exact, but there are situations where that won't be the case.

      And yeah this means slowly building out skills with enough conditions and rules and advice.

  • kukkeliskuu 5 days ago

    > It was so impressed with its new performance it started adding rocket ship emojis to the output summary.

    I laughed more at this than I probably should have, out of recognition.