Comment by lolinder
Test-driven development is sequenced the way it is for a reason. Getting a failing test first builds confidence that the test is, you know, actually testing something. And the process of writing the tests is often where most of the reasoning about design choices takes place.
Having an LLM generate the tests after you've already written the code they're supposed to cover is super counterproductive. Who knows whether those tests actually test anything?
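To make the "is it actually testing something" point concrete, here's a minimal sketch. Everything in it is a made-up toy for illustration (slugify, fixed_slugify, and both test functions are hypothetical, not from any real codebase): a test written after the fact can pass against buggy code, while a test written first from the spec fails until the code is right.

```python
def slugify(title: str) -> str:
    # Hypothetical function under test, deliberately buggy: it never lowercases.
    return title.strip().replace(" ", "-")


def fixed_slugify(title: str) -> str:
    # The implementation you end up with once the spec-first test forces the issue.
    return title.strip().lower().replace(" ", "-")


def vacuous_test(fn) -> bool:
    # The kind of test you can get when it is generated from existing code:
    # it only checks that a string comes back, so it never had a chance to fail.
    return isinstance(fn("Hello World"), str)


def spec_first_test(fn) -> bool:
    # Written before the code, from the intended behavior:
    # it pins down the exact expected output.
    return fn("Hello World") == "hello-world"


if __name__ == "__main__":
    print(vacuous_test(slugify))           # True: passes despite the bug
    print(spec_first_test(slugify))        # False: the "red" step catches it
    print(spec_first_test(fixed_slugify))  # True: "green" after the fix
```

Seeing spec_first_test fail against the buggy version is the whole point of the red step: it's evidence the test can fail at all.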
I know this gets into "I wanted AI to do my laundry, not my art" territory, but a far more rational division of labor is for the humans to write the tests (maybe with the assistance of an autocomplete model) and give those as context for the AI. Humans are way better at thinking of edge cases and design constraints than the models are at this point in the game.