Comment by adriand
Fascinating video. I watched almost the whole thing without planning to; I got sucked in.
This is one of those examples of software that reminds me of my struggle to understand how LLMs are passing code evaluations that culminate in people declaring that they are now better than even the best human coders. I have tried to get LLMs (specifically Claude and ChatGPT, trying various models) to assist with niche problems, and it's been a terrible experience. They're fantastic with CRUD or common algorithms, terrible when it's something novel or unusual.
The author creates his own version of a "FLIP simulation". I'm going to go out on a limb and posit that even ChatGPT's unreleased o3 model would not be up to the task of writing the software that powers this pendant. Is this incorrect? I realize that my comment is perhaps a little off-topic, given that this is not an AI project. However, this project seems like an excellent example of the sort of thing that I am quite skeptical the supposedly "world-class" artificial software engineers could pull off.
I've implemented fluid mechanics using Claude (through Cursor), and it had no problem writing the logic and integrating it with my custom physics engine and custom renderer.
So no, I don't think your assessment is correct. LLMs shine when they get to implement something from scratch on a blank slate with clear API boundaries, whether it's a CRUD app or a physics simulation. Where I think they struggle the most is in big legacy codebases on tasks spanning multiple modules with lots of red herrings.