Comment by rkuodys 5 days ago

4 replies

I am honestly curious about your point on the productivity boost. Are you saying that you can write tests at the same speed as AI can? Or is the point that tests written by AI are of such low quality that they are not worth using? I am in the role of a solopreneur now and I see a lot of benefit from AI. But then I read posts like yours saying that experienced devs don't see much value in AI, and I start to doubt the things I do. Are they bad quality (possibly), or is something else going on?

imron 5 days ago

I’m not faster at writing tests than AI but my own code needs fewer tests.

When I’m writing my own code I can verify the logic as I go. Coupled with a strong type system and a judicious use of _some_ tests, that's generally enough for my code to be correct.

By comparison, the AI needs more tests to keep it on the right path; otherwise the final code is not fit for purpose.

For example, in a recent use case I needed to take a JSON blob containing an array of strings that contained numbers and return an array of Decimals sorted in ascending order.

This seemed a perfect use case - a short well defined task with clear success criteria so I spent a bunch of time writing the requirements and building out a test suite and then let the AI do its thing.

The AI produced OK code, but it sorted everything lexicographically before converting to Decimal, rather than converting to Decimals first and sorting numerically, so 1000 was less than 900.
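A minimal Python sketch of that failure mode, for illustration only. The original code wasn't posted, so the function names and the sample blob here are assumptions.

```python
import json
from decimal import Decimal

# Hypothetical reconstruction; names and input are illustrative.

def sorted_decimals_buggy(blob: str) -> list[Decimal]:
    # Sorts the raw strings first (lexicographic order), then converts.
    return [Decimal(s) for s in sorted(json.loads(blob))]

def sorted_decimals(blob: str) -> list[Decimal]:
    # Converts to Decimal first, then sorts numerically.
    return sorted(Decimal(s) for s in json.loads(blob))

blob = '["900", "1000", "12.5"]'
print(sorted_decimals_buggy(blob))  # "1000" sorts before "900" as a string
print(sorted_decimals(blob))        # numeric ascending order
```

Both versions return Decimals and both pass a test that only uses inputs with the same number of digits, which is why a suite without a mixed-width case would not catch the difference.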

So I point it out, the AI says "good point, you’re absolutely correct", we add a test for this, and it goes again and gets the right result. But that’s not a mistake I would have made or needed a test for (though you could argue it’s a good test to have).

You could also argue that I should have specified the problem more clearly, but then we come back to the point that if I’m writing every specific detail in English first, it’s faster for me just to write it in code in the first place.

locknitpicker 5 days ago

> Are you saying that you can write tests at the same speed as AI can?

I feel this is a gross mischaracterization of any user flow involving using LLMs to generate code.

The hard part of generating code with LLMs is not how fast the code is generated. The hard part is verifying it actually does what it is expected to do. Unit tests too.

LLMs excel at spewing test cases, but you need to review each and every single test case to verify it does anything meaningful or valid and you need to iterate over tests to provide feedback on whether they are even green or what is the code coverage. That is the part that consumes time.

Claiming that LLMs are faster at generating code than you is like claiming that copy-and-pasting code out of Stack Overflow is faster than you writing it. Perhaps, but how can you tell if the code actually works?

welshwelsh 5 days ago

Try giving this prompt to your favorite LLM:

"Write unit tests with full line and branch coverage for this function:

def add_two_numbers(x, y):
    return x + y + 1"

Sometimes the LLM will point out that this function does not, in fact, return the sum of x and y. But more often, it will happily write "assert add_two_numbers(1, 1) == 3", without comment.

The big problem is that LLMs will assume that the code they are writing tests for is correct. This defeats the main purpose of writing tests, which is to find bugs in the code.
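To make the distinction concrete, here is a rough sketch contrasting a check derived from the implementation with one derived from the intended behavior. The `check_*` names are hypothetical, and the "specification" is assumed from the function's name: it should return the sum of its arguments.

```python
def add_two_numbers(x, y):
    return x + y + 1  # the off-by-one bug from the prompt above

def check_mirrors_implementation():
    # Expected value was read off the buggy code, so this passes
    # and gives full line coverage without finding anything.
    assert add_two_numbers(1, 1) == 3

def check_from_specification():
    # Expected value comes from what the name promises (1 + 1 == 2),
    # so this raises AssertionError against the buggy code.
    assert add_two_numbers(1, 1) == 2
```

Both checks execute every line of the function, so coverage numbers alone can't tell them apart.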

  • lurking_swe 5 days ago

    Tip: teach it how to write tests properly. I’ll share what has worked pretty well for me.

    Run Cursor in “agent” mode, or create a Codex or Claude Code “unit test” skill. I recommend Claude Code.

    Explain to the LLM that after it creates or modifies a test, it must run the test to confirm it passes. If the test fails, it’s not allowed to edit the source code; instead it must determine whether the bug is in the test or in the source code. If the test is buggy, it should try again. If the bug is in the source code, it should pause, propose a fix, and consult with you on next steps.

    The key insight here is you need to tell it that it’s not supposed to randomly edit the source code to make the test pass. I also recommend reviewing the unit tests at a high level, to make sure it didn’t hallucinate.