Comment by brushfoot 3 days ago

151 replies

I read AI coding negativity on Hacker News and Reddit with more and more astonishment every day. It's like we live in different worlds. I expect the breadth of tooling is partly responsible. What it means to you to "use the LLM code" could be very different from what it means to me. What LLM are we talking about? What context does it have? What IDE are you using?

Personally, I wrote 200K lines of my B2B SaaS before agentic coding came around. With Sonnet 4 in Agent mode, I'd say I now write maybe 20% of the ongoing code from day to day, perhaps less. Interactive Sonnet in VS Code and GitHub Copilot Agents (autonomous agents running on GitHub's servers) do the other 80%. The more I document in Markdown, the higher that percentage becomes. I then carefully review and test.

systemf_omega 3 days ago

> B2B SaaS

Perhaps that's part of it.

People here work on all kinds of industries. Some of us are implementing JIT compilers, mission-critical embedded systems or distributed databases. In code bases like this you can't just wing it without breaking a million things, so LLM agents tend to perform really poorly.

  • sunrunner 3 days ago

    > People here work on all kinds of industries.

    Yes, it would be nice to have a lot more context (pun intended) when people post how many LoC they introduced.

    B2B SaaS? Then can I assume that a browser is involved and that a big part of that 200k LoC is the verbose styling DSL we all use? On the other hand, Nginx, a production-grade web server, is 250k LoC (251,232 to be exact [1]). These two things are not comparable.

    The point being that, as I'm sure we all agree, LoC is not a helpful metric for comparison without more context, and different projects have vastly different amounts of information/feature density per LoC.

    [1] https://openhub.net/p/nginx

    • Fr0styMatt88 2 days ago

      I primarily work in C# during the day but have been messing around with simple Android TV dev on occasion at night.

      I’ve been blown away sometimes at what Copilot puts out in the context of C#, but using ChatGPT (paid) to get me started on an Android app - totally different experience.

      Stuff like giving me code that’s using a mix of different APIs and sometimes just totally non-existent methods.

      With Copilot I find it's sometimes brilliant, but it seems so random as to when that will be.

      • motorest 2 days ago

        > Stuff like giving me code that’s using a mix of different APIs and sometimes just totally non-existent methods.

        That has been my experience as well. You can control the surprising choice of APIs with basic prompt files that clarify what to use, and how, in your project. However, when using less-than-popular tools whose source code is not available, the hallucinations are unbearable and a complete waste of time.
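
        For example, a repo-level instructions file along these lines is usually enough to stop the API roulette. This is an illustrative sketch only: the file path is GitHub Copilot's custom-instructions convention, but the rules and names are invented, not anyone's actual setup.

          # .github/copilot-instructions.md (hypothetical example)
          - This service targets .NET 8 / ASP.NET Core minimal APIs.
          - Use the existing EF Core AppDbContext for data access; do not add Dapper or raw SQL.
          - Log through ILogger<T>; never use Console.WriteLine.
          - Prefer APIs already referenced in the project over new NuGet packages.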

        The lesson to be learned is that LLMs depend heavily on their training set, and in a simplistic way they at best only interpolate between the data they were fed. If an LLM is not trained with a corpus covering a specific domain, then you can't expect usable results from it.

        This brings up some unintended consequences. Companies like Microsoft will be able to create incentives to use their tech stack by training their LLMs with a very thorough and complete corpus on how to use their technologies. If Copilot does miracles outputting .NET whereas Java is unusable, developers have one more reason to adopt .NET to lower their cost of delivering and maintaining software.

    • godelski 2 days ago

        > when people post how many LoC they introduced.
      
      Pretty ironic you and the GP talk about lines of code.

      From the article:

        Garman is also not keen on another idea about AI – measuring its value by what percentage of code it contributes at an organization.
      
        “It’s a silly metric,” he said, because while organizations can use AI to write “infinitely more lines of code” it could be bad code.
      
        “Often times fewer lines of code is way better than more lines of code,” he observed. “So I'm never really sure why that's the exciting metric that people like to brag about.”
      
      I'm with Garman here. There's no clean metric for how productive someone is when writing code. At best, this metric is naive, but usually it is just idiotic.

      Bureaucrats love LoC, commits, and/or Jira tickets because they are easy to measure, but here's the truth: to measure the quality of code you have to be capable of producing said code at (approximately) said quality or better. Data isn't just "data" that you can treat as a black box and throw into algorithms. Data requires interpretation, and there's no "one size fits all" solution. Data is nothing without its context. It is always biased, and if you avoid nuance you'll quickly convince yourself of falsehoods. Even with expertise it is easy to convince yourself of falsehoods. Without expertise it is hopeless. Just go look at Reddit or any corner of the internet where armchair experts confidently talk about things they know nothing about. It is always void of nuance and vastly oversimplified. But humans love simplicity. We need to recognize our own biases.

      • sunrunner 2 days ago

        > Pretty ironic you and the GP talk about lines of code.

        I was responding specifically to the comment I replied to, not the article, and mentioning LoC as a specific example of things that don't make sense to compare.

        • godelski 2 days ago

            > the comment I replied to
          
          Which was the "GP", or "grand parent" (your comment is the parent of my comment), that I was referring to.
      • darkwater 2 days ago

        > Bureaucrats love LoC

        Looks like vibe-coders love them too, now.

  • drusepth 3 days ago

    On the other hand, fault-intolerant codebases are also often highly defined and almost always have rigorous automated tests already, which are two contexts where coding agents specifically excel.

  • JambalayaJimbo 3 days ago

    I work on brain dead crud apps much of my time and get nothing from LLMs.

    • benjaminwootton 2 days ago

      Try Claude Code. You’ll literally be able to automate 90% of the coding part of your job.

      • dns_snek 2 days ago

        We really need to add some kind of risk to people making these claims to make it more interesting. I listened to the type of advice you're giving here on more occasions than I can remember, at least once for every major revision of every major LLM and always walked away frustrated because it hindered me more than it helped.

        > This is actually amazing now, just use [insert ChatGPT, GPT-4, 4.5, 5, o1, o3, Deepseek, Claude 3.5, 3.9, Gemini 1, 1.5, 2, ...] it's completely different from Model(n-1) you've tried.

        I'm not some mythical 140 IQ 10x developer and my work isn't exceptional so this shouldn't happen.

      • delta_p_delta_x 2 days ago

        I've been working on macOS and Windows drivers. Can't help but disagree.

        Because of the absolute dearth of high-quality open-source driver code and the huge proliferation of absolutely bottom-barrel general-purpose C and C++, the result is... Not good.

        On the other hand, I asked Claude to convert an existing, short-ish Bash script to idiomatic PowerShell with proper cmdlet-style argument parsing, and it returned a decent result that I barely had to modify or iterate on. I was quite impressed.

        Garbage in, garbage out. I'm not altogether dismissive of AI and LLMs but it is really necessary to know where and what their limits are.

        • Sharlin 2 days ago

          I'm pretty sure the GP referred to GGP's "brain dead CRUD apps" when they talked about automating 90% of the work.

    • murukesh_s 2 days ago

      I found the opposite - I am able to get a 50% improvement in productivity for day-to-day coding (a mix of backend and frontend), mostly in JavaScript, but it has helped in other languages too. You do have to review carefully though - and have extremely well-written test cases if you are going to blindly generate or replace existing code.

  • motorest 2 days ago

    > In code bases like this you can't just wing it without breaking a million things, so LLM agents tend to perform really poorly.

    This is a false premise. LLMs themselves don't force you to introduce breaking changes into your code.

    In fact, the inception of coding agents was lauded as a major improvement to the developer experience because they allow the LLMs themselves to automatically react to feedback from test suites, thus speeding up how code was implemented while preventing regressions.

    If tweaking your code can result in breaking a million things, this is a problem with your code and how you worked to make it resilient. LLMs are only able to introduce regressions if your automated tests are unable to catch any of these millions of things breaking. If this is the case then your problems are far greater than LLMs existing, and at best LLMs only point out the elephant in the room.

malfist 3 days ago

Perhaps the issue is you were used to writing 200k lines of code. Most engineers would be aghast at that. Lines of code is a debit not a credit.

  • Deestan 3 days ago

    I am reacting emotionally here, based on zero knowledge of the B2B codebase's environment, but to be honest I think it is relevant to the discussion of why people are "worlds apart".

    200k lines of code is a failure state. At this point you have lost control and can only make changes to the codebase through immense effort, and not at a tolerable pace.

    Agentic code writers are good at giving you this size of mess and at helping to shovel stuff around to make changes that are hard for humans due to the unusable state of the codebase.

    If overgrown, barely manageable codebases are all a person's ever known, and they think it's normal that changes are hard and time-consuming and need reams of code, I understand that they believe AI agents are useful as code writers. I think they do not have the foundation to tell mediocre from good code.

    I am extremely aware of the judgemental hubris of this comment. I'd not normally huff my own farts in public this obnoxiously, but I honestly feel it is useful for the "AI hater vs AI sucker" discussion to be honest about this type of emotion.

    • mind-blight 3 days ago

      It really depends on what your use case is. E.g. if you're dealing with a lot of legacy integrations, dealing with all the edge cases can require a lot of code that you can't refactor away through cleverness.

      Each integration is hopefully only a few thousand lines of code, but if you have 50 integrations you can easily break 100k loc just dealing with those. They just need to be encapsulated well so that the integration cruft is isolated from the core business logic, and they become relatively simple to reason about

    • bubblyworld 2 days ago

      > 200k lines of code is a failure state.

      What on earth are you talking about? This is unavoidable for many use-cases, especially ones that involve interacting with the real world in complex ways. It's hardly a marker of failure (or success, for that matter) on its own.

    • haskellshill 17 hours ago

      If all your code depends on all your other code, yeah 200k lines might be a lot. But if you actually know how to code, I fail to understand why 200k lines (or any number) of properly encapsulated well-written code would be a problem.

      Further, if you yourself don't understand the code, how can you verify that using LLMs to make major sweeping changes doesn't mess anything up, given that they are notorious for making random errors?

    • throwawaymaths 2 days ago

      200k LoC is not a failure state. Suppose your B2B SaaS has 5 user types and 5 downstream SaaSes it connects to; that's 20k LoC per major programming unit. Not so bad.

    • johnnyanmac 3 days ago

      I agree on principle, and I'm sure many of us know how much of a pain it is to work on million or even billion dollar codebases, where even small changes can be weeks of bureaucracy and hours of meetings.

      But with the way the industry is, I'm also not remotely surprised. We have people come and go as they are poached, burned out, or simply hit by life circumstances. The training for new people isn't the best, and the documentation anywhere but the large companies is probably a mess. We also don't tend to encourage periods to focus on properly addressing tech debt, instead focusing on delivering features. I don't know how such an environment, over years and decades, doesn't generate redundant, clashing, and quirky interactions. The culture doesn't allow much alternative.

      And of course, I hope even the most devout AI evangelists realize that AI will only multiply this culture. Code that no one may even truly understand, but "it works". I don't know if even Silicon Valley (2014) could have made a parody more shocking than the reality this will yield.

  • rootnod3 3 days ago

    In that case, LLMs are full on debt-machines.

    • threecheese 3 days ago

      Ones that can remediate it, though. If I am capable of safely refactoring 1,000 copies of a method, in a codebase that humans don't look at, does it really matter, as long as the workload functions as designed?

      • sdenton4 2 days ago

        Jeebus, 'safely' is carrying a hell of a lot of water there...

      • JustExAWS 2 days ago

        In a type-safe language like C# or Java, why would you need an LLM for that? It's a standard, guaranteed-safe refactor (as long as you aren't using reflection) with ReSharper.

      • uoaei 2 days ago

        Features present in all IDEs over the last 5 years or so are better and more verifiably correct for this task than probabilistic text generators.

  • d0mine 2 days ago

    You might have meant "code is a liability not an asset"

  • rahimnathwani 3 days ago

      Lines of code is a debit not a credit
    
    Perhaps you meant this the other way around. A credit entry indicates an increase in the amount you owe.

    • bmurphy1976 3 days ago

      It's a terrible analogy either way. It should be: each extra line of code beyond the bare minimum is a liability.

    • malfist 2 days ago

      You are absolutely correct, I am not a finance wizard

      • aspenmayer 2 days ago

        Liability vs asset is what you were trying to say, I think, but everyone says that, so to be charitable I think you were trying to put a new spin on the phrasing, which I think is admirable, to your credit.

s1mplicissimus 3 days ago

It's interesting how LLM enthusiasts will point to problems like IDE, context, model etc. but not the one thing that really matters:

Which problem are you trying to solve?

At this point my assumption is they learned that talking about this question will very quickly reveal that "the great things I use LLMs for" are actually personal throwaway pieces, not to be extended above triviality or maintained for longer than a year. Which, I guess, doesn't make for a great sales pitch.

  • phito 2 days ago

    It's amazing for making small custom apps and scripts, and they're such high quality (compared to what I would half-ass write and never finish or polish) that they don't end up as "throwaway"; I keep using them all the time. The LLM is saving me time to write these small programs, and the small programs boost my productivity.

    Often, I will solve a problem in a crappy single-file script, then feed it to Claude and ask to turn it into a proper GUI/TUI/CLI, add CI/CD workflows, a README, etc...

    I was very skeptical and reluctant of LLM assisted coding (you can look at my history) until I actually tried it last month. Now I am sold.

  • maigret 2 days ago

    At work I often need smaller, short-lived scripts to find this or that insight, or to visualize some data, and I find LLMs very useful for that.

    A non-coding topic, but recently I had difficulty articulating a summarized state of a complex project, so I spoke for two minutes into the microphone and it gave me a pretty good list of accomplishments, todos, and open points.

    Some colleagues have found them useful for modernizing dependencies of microservices or for getting a head start on unit test coverage for web apps. All kinds of grunt work that's not really complex but just involves moving quite a lot of text around.

    I agree it’s not life changing, but a nice help when needed.

  • wan23 2 days ago

    I use it to do all the things that I couldn't be bothered to do before. Generate documentation, dump and transform data for one off analyses, write comprehensive tests, create reports. I don't use it for writing real production code unless the task is very constrained with good test coverage, and when I do it's usually to fix small but tedious bugs that were never going to get prioritized otherwise.

Ballas 2 days ago

There is definitely a divide in users - those for whom it works and those for whom it doesn't. I suspect it comes down to what language and what tooling you use. People doing web-related or Python work seem to be doing much better than people doing embedded C or C++. Similarly, doing C++ in a popular framework like Qt also yields better results. When the system design is not pre-defined or rigid like in Qt, then you get completely unmaintainable code as a result.

If you are writing code that is/can be "heavily borrowed" - things that have complete examples on Github, then an LLM is perfect.

  • hn_throwaway_99 2 days ago

    While I agree that AI assisted coding probably works much better for languages and use cases that have a lot more relevant training data, when I read comments from people who like LLM assisted coding vs. those that don't, I strongly get the impression that the difference has a lot more to do with the programmers than their programming language.

    The primary difference I see in people who get the most value from AI tools is that they expect it to make mistakes: they always carefully review the code and are fine with acting, in some cases, more like an editor than an author. They also seem to have a good sense of where AI can add a lot of value (implementing well-defined functions, writing tests, etc.) vs. where it tends to fall over (e.g. tasks where large-scale context is required). Those who can't seem to get value from AI tools seem (at least to me) less tolerant of AI mistakes and less willing to iterate with AI agents, and they seem more willing to "throw the baby out with the bathwater", i.e. they fixate on some of the failure cases rather than just limiting usage to cases where AI does a better job.

    To be clear, I'm not saying one is necessarily "better" than the other, just that the reason for the dichotomy has a lot more to do with the programmers than the domain. For me personally, while I get a lot of value in AI coding, I also find that I don't enjoy the "editing" aspect as much as the "authoring" aspect.

    • paufernandez 2 days ago

      Yes, and each person has a different perception of what is "good enough". Perfectionists don't like AI code.

      • skydhash 2 days ago

        My main reason is: Why should I try twice or more, when I can do it once and expand my knowledge? It's not like I have to produce something now.

    • robenkleene 2 days ago

      > I strongly get the impression that the difference has a lot more to do with the programmers than their programming language.

      The problem with this perspective is that anyone who works in more niche programming areas knows the vast majority of programming discussions online aren't relevant to them. E.g., I've done macOS/iOS programming most of my career, and I now do work that's an order of magnitude more niche than that, and I commonly see programmers saying things like "you shouldn't use a debugger", which is a statement that I can't imagine a macOS or iOS programmer saying (don't get me wrong, they're probably out there, I've just never met or encountered one). So you just become used to most programming conversations being irrelevant to your work.

      So of course the majority of AI conversations aren't relevant to your work either, because that's the expectation.

      I think a lot of these conversations are two people with wildly different contexts trying to communicate, which is just pointless. Really we just shouldn't be trying to participate in these conversations (the more niche programmers that is), because there's just not enough shared context to make communication effective.

      We just all happen to fall under this same umbrella of "programming", which gives the illusion of a shared context. It's true there are some things that are relevant across the field (it's all just variables, loops, and conditionals), but many of the other details aren't universal, so it's silly to talk about them without first understanding the full context around the other person's work.

      • hn_throwaway_99 2 days ago

        > and I commonly see programmers saying thing like "you shouldn't use a debugger"

        Sorry, but who TF says that? This is actually not something I hear commonly, and if it were, I would just discount this person's opinion outright unless there were some other special context here. I do a lot of web programming (Node, Java, Python primarily) and if someone told me "you shouldn't use a debugger" in those domains I would question their competence.

    • felipeerias 2 days ago

      It might boil down to individual thinking styles, which would explain why people tend to talk past each other in these discussions.

    • jappgar 2 days ago

      No one likes to hear it, but it comes down to prompting skill. People who are terrible at communicating and delegating complex tasks will be terrible at prompting.

      It's no secret that a lot of engineers are bad at this part of the job. They prefer to work alone (i.e. without AI) because they lack the ability to clearly and concisely describe problems and solutions.

      • JackFr 2 days ago

        This. I work with juniors who have no idea what a spec is, and the idea of designing precisely what a component should do, especially in error cases, is foreign to them.

        One key to good prompting is clear thinking.

  • motorest 2 days ago

    > If you are writing code that is/can be "heavily borrowed" - things that have complete examples on Github, then an LLM is perfect.

    I agree with the general premise. There is however more to it than "heavily borrowed". The degree to which a code base is organized and structured and curated plays as big of a role as what framework you use.

    If your project is a huge pile of unmaintainable and buggy spaghetti code then don't expect an LLM to do well. If your codebase is well structured, clear, and follows patterns systematically, then of course a glorified pattern-matching service will do far better at outputting acceptable results.

    There is a reason why one of the most basic vibecoding guidelines is to include a prompt cycle to clean up and refactor code between introducing new features. LLMs fare much better when the project in their context is in line with their training. If you refactor your project to align it with what a LLM is trained to handle, it will do much better when prompted to fill in the gaps. This goes way beyond being "heavily borrowed".

    I don't expect your average developer struggling with LLMs to acknowledge this fact, because then they would need to explain why their work is unintelligible to a system trained on vast volumes of code. Garbage in, garbage out. But who exactly created all the garbage going in?

  • pydry 2 days ago

    I suspect it comes down to how novel the code you are writing is and how tolerant of bugs you are.

    People who use it to create a proof of concept of something that is in the LLM training set will have a wildly different experience to somebody writing novel production code.

    Even there the people who rave the most rave about how well it does boilerplate.

  • jstummbillig 2 days ago

    > When the system design is not pre-defined or rigid like

    Why would an LLM be any worse building from language fundamentals (which it knows, in ~every language)? Given how new this paradigm is, the far more obvious and likely explanation seems to be: LLM-powered coding requires somewhat different skills and strategies. The success of each user heavily depends on their learning rate.

  • PUSH_AX 2 days ago

    I think there are still lots of code "artisans" who are completely dogmatic about what code should look like. Once the tunnel vision goes and you realise the code just enables the business, it all of a sudden becomes a velocity godsend.

    • gtsop 2 days ago

      Two years in and we are waiting to see all you people (who are free of our tunnel vision) fly high with your velocity. I don't see anyone, am I doing something wrong?

      Your words predict an explosion of unimaginable magnitude in new code and new businesses. Where is it? Nowhere.

      Edit: And don't start about how you vibed a SaaS service; show income numbers from paying customers (not buyouts).

      • hn_throwaway_99 2 days ago

        There was this recent post about a Cloudflare OAuth client where the author checked in all the AI prompts, https://news.ycombinator.com/item?id=44159166.

        The author of the library (kentonv) commented in the HN thread that it took him a few days to write the library with AI help, while he thinks it would have taken weeks or months to write manually.

        Also, while it may be technically true we're "two years in", I don't think this is a fair assessment. I've been trying AI tools for a while, and the first time I felt "OK, now this is really starting to enhance my velocity" was with the release of Claude 4 in May of this year.

        • ath92 2 days ago

          But that example is of writing a green field library that deals with an extremely well documented spec. While impressive, this isn’t what 99% of software engineering is. I’m generally a believer/user but this is a poor example to point at and say “look, gains”.

      • PUSH_AX 2 days ago

        Do you have some magical insight into every codebase in existence? No? Ok then…

    • imiric 2 days ago

      The issue is not with how code looks. It's with what it does, and how it does it. You don't have to be an "artisan" to notice the issues moi2388 mentioned.

      The actual difference is between people who care about the quality of the end result, and the experience of users of the software, and those who care about "shipping quickly" no matter the state of what they're producing.

      This difference has always existed, but ML tools empower the latter group much more than the former. The inevitable outcome of this will be a stark decline of average software quality, and broad user dissatisfaction. While also making scammers and grifters much more productive, and their scams more lucrative.

      • Buttons840 2 days ago

        Certainly billions of people's personal data will be leaked, and nobody will be held responsible.

    • cowl 2 days ago

      There are very good reasons that code should look a certain way, and they come from years of experience and the fact that code is written once but read and modified much more often.

      When the first bugs come up you see that the velocity was not god-sent, and you end up hiring one of the many "LLM code fixer" companies that are popping up like mushrooms.

      • PUSH_AX 2 days ago

        You’re confusing yoloing code into prod with using AI to increase velocity while ensuring it functions and is safe.

    • Buttons840 2 days ago

      I'm not a code "artisan", but I do believe companies should be financially responsible when they have security breaches.

codingdave 3 days ago

And also ask: "How much money do you spend on LLMs?"

In the long run, that is going to be what drives their quality. At some point the conversation is going to evolve from whether or not AI-assisted coding works to what the price point is to get the quality you need, and whether or not that price matches its value.

tetha 3 days ago

I deal with a few code bases at work and the quality differs a lot between projects and frameworks.

We have 1-2 small Python services based on Flask and Pydantic, very structured, with a well-written development and extension guide. The newer Copilot models perform very well with this, and improving the dev guidelines keeps making it better. Very nice.

We also have a central configuration of applications in the infrastructure and what systems they need. A lot of similarly shaped JSON files, now with a well-documented JSON schema (which is nice to have anyway). Again, very high quality. Someone recently joked we should throw these service requests at a model and let it create PRs to review.
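
As an illustration of why a documented schema helps (a hypothetical sketch: the field names are invented, and a Pydantic model stands in for the actual JSON schema), each of those similarly shaped config files can be validated against a small, documented model, so both humans and the LLM have a precise description of the expected shape:

  # service_config.py - illustrative only, not the real schema
  from pydantic import BaseModel

  class ServiceConfig(BaseModel):
      name: str
      team: str
      needs: list[str] = []      # e.g. ["postgres", "redis"]
      ingress: bool = False

  cfg = ServiceConfig.model_validate_json(
      '{"name": "billing", "team": "payments", "needs": ["postgres"]}'
  )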

But currently I'm working in Vector and its Vector Remap Language... it's enough of a mess that I'm faster working without any Copilot "assistance". I think the main issue is that there is very little VRL code out in the open, and the remaps depend on a lot of unseen context, which one would have to work on giving to the LLM. I've had similar experiences with OPA and a few more of these DSLs.

lcnPylGDnU4H9OF 3 days ago

> It's like we live in different worlds.

There is the huge variance in prompt specificity as well as the subtle differences inherent to the models. People often don't give examples when they talk about their experiences with AI so it's hard to get a read on what a good prompt looks like for a given model or even what a good workflow is for getting useful code out of it.

  • ruszki 3 days ago

    Some gave examples. Some even recorded it and showed it, because they thought that they were good with it. But they weren’t good at all.

    They were slower than coding by hand, if you wanted to keep quality. Some were almost as quick as copy-pasting from the code just above the generated one, but their quality was worse. They even kept some bugs in the code during their reviews.

    So the different world is probably about what the acceptable level of quality means. I know a lot of coders who don’t give a shit whether what they’re doing makes sense, or what their bad solution will cause in the long run. They ignore everything except the “done” state next to their tasks in Jira. They will never solve complex bugs; they simply don’t care enough. At a lot of places, they are the majority. For them, an LLM can be an improvement.

    Claude Code the other day made a test for me which mocked everything out from the live code. Everything was green, everything was good. On paper. A lot of people simply wouldn’t care to review it properly. That thing can generate a few thousand lines of semi-usable code per hour; it’s not built to review it properly. Serena MCP, for example, is specifically built not to review what it does. That’s stated by its creators.

    • typpilol 3 days ago

      Honestly I think LLMs really shine best when you're first getting into a language.

      I just recently got into JavaScript and TypeScript, and being able to ask the LLM how to do something and get some sources and linked examples is really nice.

      However using it in a language I'm much more familiar with really decreases the usefulness. Even more so when your code base is mid to large sized

      • myaccountonhn 2 days ago

        I have scaffolded projects using LLMs in languages I don't know and I agree that it can be a great way to learn as it gives you something to iterate on. But that is only if you review/rewrite the code and read documentation alongside it. Many times LLMs will generate code that is just plain bad and confusing even if it works.

        I find that LLM coding requires more in-depth understanding, because rather than just coming up with a solution you need to understand the LLM's solution and decide whether the complexity is necessary, because it will add structures, defensive code, and more that you wouldn't add if you coded it yourself. It's way harder to judge whether some code is necessary or is the correct way to do something.

      • dns_snek 2 days ago

        This is the one place where I find real value in LLMs. I still wouldn't trust them as teachers because many details are bound to be wrong and potentially dangerous, but they're great initial points of contact for self-directed learning in all kinds of fields.

      • platevoltage 2 days ago

        Yeah this is where I find a lot of value. Typescript is my main language, but I often use C++ and Python where my knowledge is very surface level. Being able to ask it "how do I do ____ in ____" and getting a half decent explanation is awesome.

      • ponector 2 days ago

        The best usage is to ask an LLM to explain existing code, or to search the legacy codebase.

        • typpilol 2 days ago

          I've found this to be not very useful in large projects or projects that are very modularized or fragmented across many files.

          Because sometimes it can't trace down all the data paths, and by the time it does its context window is running out.

          That seems to be the biggest issue I see for my daily use anyways

    • dns_snek 2 days ago

      > Some gave. Some even recorded it, and showed it, because they thought that they are good with it. But they weren’t good at all.

      Do you have any links saved by any chance?

  • giantg2 2 days ago

    I'm convinced that for coding we will have to use some sort of TDD or enhanced requirement framework to get the best code. Even in human-made systems the quality is highly dependent on the specificity of the requirements and the engineer's ability to probe the edge cases. Something like writing all the tests first (even in something like Cucumber) and having the LLM write code to get them to pass would likely produce better code, even though most devs hate the test-first paradigm.
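
    A minimal sketch of that workflow, assuming pytest rather than Cucumber (the module and function names are invented for illustration): the human writes the failing spec, then the agent is asked to make it pass without touching the tests.

      # tests/test_pricing.py - written by a human before any implementation exists
      import pytest
      from pricing import apply_discount  # hypothetical module the LLM will write

      def test_discount_reduces_price():
          assert apply_discount(price=100.0, percent=10) == 90.0

      def test_negative_percent_is_rejected():
          with pytest.raises(ValueError):
              apply_discount(price=100.0, percent=-5)

      # Prompt to the agent: "Implement pricing.apply_discount so `pytest` passes;
      # do not modify the tests."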

skydhash 3 days ago

> Personally, I wrote 200K lines of my B2B SaaS

That would probably be 1000 line of Common Lisp.

albrewer 3 days ago

My AI experience has varied wildly depending on the problem I'm working on. For web apps in Python, they're fantastic. For hacking on old engineering calculation code written in C/C++, it's an unmitigated disaster and an active hindrance.

  • f1shy 3 days ago

    Just last week I asked Copilot to make a FastCGI client in C. It gave me code that did not compile, five times. After some massaging I got it to compile; it didn’t work. After some changes, it works. Now I say “I do not want to use libfcgi, just want a simple implementation”. After an hour of wrestling, I realize the whole thing blocks, and I want no blocking calls… still, half an hour of fighting later, I’m slowly getting there. I see the code: a total mess.

    I deleted it all and wrote from scratch a 350-line file which works.

    • paool 2 days ago

      Context engineering > vibe coding.

      Front load with instructions, examples, and be specific. How well you write the prompt greatly determines the output.

      Also, use Claude code not copilot.

      • kentm 2 days ago

        At some point it becomes easier to just write the code. If the solution was 350 lines, then I'm guessing it was far easier for them to just write that rather than tweak instructions, find examples, etc. to cajole the AI into writing workable code (which would then need to be reviewed and tweaked if done properly).

        • f1shy 2 days ago

          Exactly, if I have to write a 340-line prompt, I could very well just start writing code.

      • kortilla 2 days ago

        “Just tell it how to write the code and then it will write the code.”

        No wonder the vast majority of AI adoption is failing to produce results.

haburka 3 days ago

It’s not just you, I think some engineers benefit a lot from AI and some don’t. It’s probably a combination of factors including: AI skepticism, mental rigidity, how popular the tech stack is, and type of engineering. Some problems are going to be very straightforward.

I also think it’s that people don’t know how to use the tool very well. In my experience I don’t guide it to do any kind of software pattern or ideology. I think that just confuses the tool. I give it very little detail and have it do tasks that are evident from the code base.

Sometimes I ask it to do rather large tasks and occasionally the output is like 80% of the way there and I can fix it up until it’s useful.

  • mlyle 3 days ago

    Yah. Latest thing I wrote was

    * Code using sympy to generate math problems testing different skills for students, with difficulty values affecting what kinds of things are selected, and various transforms to problems possible (e.g. having to solve for z+4 of 4a+b instead of x) to test different subskills

    (On this part, the LLM did pretty well. The code was correct after a couple of quick iterations, and the base classes and end-use interfaces are correct. There's a few things in the middle that are unnecessarily "superstitious" and check for conditions that can't happen, and so I need to work with the LLM to clean it up. A rough sketch of this kind of generator is below.)

    * Code to use IRT to estimate the probability that students have each skill and to request problems with appropriate combinations of skills and difficulties for each student.

    (This was somewhat garbage. Good database & backend, but the interface to use it was not nice and it kind of contaminated things).

    * Code to recognize QR codes in the corners of worksheet, find answer boxes, and feed the image to ChatGPT to determine whether the scribble in the box is the answer in the correct form.

    (This was 100%, first time. I adjusted the prompt it chose to better clarify my intent in borderline cases).
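
    For illustration only - a hypothetical sketch of the kind of generator the first bullet describes, not the actual code: pick coefficients by difficulty, build an equation with sympy, and optionally ask for a shifted target like a + k instead of a.

      # hypothetical sketch, not the project's code
      import random
      import sympy as sp

      def make_problem(difficulty=1, shifted=False):
          a = sp.Symbol("a")
          c = random.randint(2, 2 + 3 * difficulty)   # coefficient scales with difficulty
          b = random.randint(1, 10)
          rhs = c * random.randint(1, 10) + b          # guarantees an integer answer
          eq = sp.Eq(c * a + b, rhs)
          answer = sp.solve(eq, a)[0]
          if shifted:
              k = random.randint(1, 5)                 # test the "solve for a + k" subskill
              return f"Given {sp.sstr(eq)}, find a + {k}", answer + k
          return f"Solve {sp.sstr(eq)} for a", answer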

    The output was, overall, pretty similar to what I'd get from a junior engineer under my supervision-- a bit wacky in places that aren't quite worth fixing, a little bit of technical debt, a couple of things more clever that I didn't expect myself, etc. But I did all of this in three hours and $12 expended.

    The total time supervising it was probably similar to the amount of time spent supervising the junior engineer... but the LLM turns things around quick enough that I don't need to context switch.

    • novembermike 2 days ago

      I think it's fair to call coding LLMs similar to fairly bad but very fast juniors that don't get bored. That's a serious drawback, but it does give you something to work with. What scares me is non-technical people just vibe coding, because it's like a PM driving the same juniors with no one to give sanity checks.

  • AlexCoventry 3 days ago

    > I also think it’s that people don’t know how to use the tool very well.

    I think this is very important. You have to look at what it suggests critically, and take what makes sense. The original comment was absolutely correct that AI-generated code is way too verbose and disconnected from the realities of the application and large-scale software design, but there can be kernels of good ideas in its output.

  • abullinan 2 days ago

    Junior engineers see better results than senior engineers for obvious reasons.

    • davidcbc 2 days ago

      Junior engineers think they see better results than senior engineers for obvious reasons

  • oooyay 3 days ago

    I think a lot of it is tool familiarity. I can do a lot with Cursor but frankly I find out about "big" new stuff every day like agents.md. If I wasn't paying attention or also able to use Cursor at home then I'd probably learn more inefficiently. Learning how to use rule globs versus project instructions was a big learning moment. As I did more LLM work on our internal tools that was also a big lesson in prompting and compaction.

    Certain parts of HN and Reddit I think are very invested in nay-saying because it threatens their livelihoods or sense of self. A lot of these folks have identities that are very tied up in being craftful coders rather than business problem solvers.

aDyslecticCrow 3 days ago

I think it's down to language and domain more than tools.

No model I've tried can write, usefully debug, or even explain CMake. (It invents new syntax if it gets stuck; I often have to prompt multiple AIs to know if even the first response in the context was made up.)

My luck with embedded C has been atrocious for existing codebases (burning millions of tokens), but passable for small scripts (Arduino projects).

My experience with Python is much better: suggesting relevant libraries and functions, debugging odd errors, or even making small scripts on its own. Even the original GitHub Copilot, which I got access to early, was excellent on Python.

A lot of people that seem to have fully embraced agentic vibe-coding seem to be in the web or Node.js domain, which I've not worked in myself since pre-AI.

I've tried most (free or trial) major models and schemes in the hope that I'd find any of them useful, but haven't found much use yet.

johnnyanmac 3 days ago

> It's like we live in different worlds.

We probably do, yes. The web domain, a cybersecurity firm, and embedded will have very different experiences, because clearly there's a lot more code to train on for one domain than another (for obvious reasons). You can have colleagues at the same company or even on the same team have drastically different experiences because they might be in the weeds on a different part of the tech.

> I then carefully review and test.

If most people did this, I would have 90% fewer issues with AI. But as we expect, people see shortcuts and use them to cut corners, not to give themselves more time to polish the edges.

oblio 3 days ago

What tech stack do you use?

Betting in advance that it's JavaScript or Python, probably with very mainstream libraries or frameworks.

  • dpc_01234 3 days ago

    FWIW, Claude Code does a great job for me on complex-domain Rust projects, but I just use it on one relatively small feature/code chunk at a time, where oftentimes it can pick up existing patterns etc. (I try to point it at similar existing code/features if I have them). I do not let it write anything creative where it has to come up with its own design (either high-level architectural, or low-level facilities). Basically I draw the lines manually and let it color the space between, using existing reference pictures. Works very, very well for me.

  • va1a 3 days ago

    Is this meant to detract from their situation? These tech stacks are mainstream because so many use them... it's only natural that AI would be the best at writing code in contexts where it has the most available training data.

    • feoren 3 days ago

      > These tech stacks are mainstream because so many use them

      That's a tautology. No, those tech stacks are mainstream because it is easy to get something that looks OK up and running quickly. That's it. That's what makes a framework go mainstream: can you download it and get something pretty on the screen quickly? Long-term maintenance and clarity is absolutely not a strong selection force for what goes mainstream, and in fact can be an opposing force, since achieving long-term clarity comes with tradeoffs that hinder the feeling of "going fast and breaking things" within the first hour of hearing about the framework. A framework being popular means it has optimized for inexperienced developers feeling fast early, which is literally a slightly negative signal for its quality.

    • aDyslecticCrow 3 days ago

      No, it's a clarification. There is a massive difference between domains, and the parent post did not specify.

      If the AI can only do JS and Python decently, then that fully explains the observed disparity in opinions of its usefulness.

  • JustExAWS 3 days ago

    You are exactly right in my case - JavaScript and Python dealing with the AWS CDK and SDK. Where there is plenty of documentation and code samples.

    Even when it occasionally gets it wrong, it’s just a matter of telling ChatGPT - “verify your code using the official documentation”.

    But honestly, even before LLMs when deciding on which technology, service, or frameworks to use I would always go with the most popular ones because they are the easiest to hire for, easiest to find documentation and answers for and when I myself was looking for a job, easiest to be the perfect match for the most jobs.

    • oblio 3 days ago

      Yeah, but most devs are working on brownfield projects where they did not choose any part of the tech stack.

      • JustExAWS 3 days ago

        They can choose jobs. Starting with my 3rd job in 2008, I always chose my employer based on how it would help me get my n+1 job, and that was based on the tech stack I would be using.

        Once I saw a misalignment between market demands and the current tech stack my employer was using, I changed jobs. I’m on job #10 now.

rozgo 3 days ago

It could be the language. Almost 100% of my code is written by AI; I do supervise as it creates and steer it in the right direction. I configure the code agents with examples of all the frameworks I'm using. My choice of Rust might be disproportionately providing better results, because cargo, the expected code structure, examples, docs, and error messages are so well thought out in Rust that the coding agents can really get very far. I work on 2-3 projects at once, cycling through them and supervising their work. Most of my work is simulation, physics, and complex robotics frameworks. It works for me.

LauraMedia 2 days ago

As a practical example, I've recently tried out v0's new updated systems to scaffold a very simple UI where I can upload screenshots from videogames I took and tag them.

The resulting code included an API call to run arbitrary SQL queries against the DB. Even after pointing this out, this API call was not removed or at least secured with authentication rules but instead /just/hidden/through/obscur/paths...

Fergusonb 2 days ago

I agree, it's like they looked at GPT 3.5 one time and said "this isn't for me"

The big 3 - Opus 4.1, GPT-5 High, Gemini 2.5 Pro

Are astonishing in their capabilities, it's just a matter of providing the right context and instructions.

Basically, "you're holding it wrong"

physicsguy 2 days ago

Do you not think part of it is just whether employers permit it or not? My conglomerate employer took a long time to get started and has only just rolled out agent mode in GH Copilot, but even that is in some reduced/restricted mode vs the public one. At the same time we have access to lots of models via an internal portal.

  • randomNumber7 2 days ago

    Companies that don't allow their devs to use LLMs will go bankrupt and in the meantime their employees will try to use their private LLM accounts.

mirkodrummer 3 days ago

B2B SaaS is in most cases a sophisticated mask over some structured data, perhaps with great UX, automation, and convenience, so I can see LLMs being more successful there, all the more so because there is more training data and many processes are streamlined. Not all domains are equal; go try to develop a serious game (not yet another simple and broken arcade) with LLMs and you'll have a different take.

abm53 2 days ago

I am also constantly astonished.

That said, observing attempts by skeptics to “unsuccessfully” prompt an LLM has been illuminating.

My reaction is usually either:

- I would never have asked that kind of question in the first place.

- The output you claim is useless looks very useful to me.

bubblyworld 2 days ago

I think people react to AI with strong emotions, which can come from many places, anxiety/uncertainty about the future being a common one, strong dislike of change being another (especially amongst autists, who I would guess, based on myself and my friend circle, are quite common around here). Maybe it explains a lot of the spicy hot-takes you see here and on lobsters? People are unwilling to think clearly or argue in good faith when they are emotionally charged (see any political discussion). You basically need to ignore any extremist takes entirely, both positive and negative, to get a pulse on what's going on.

If you look, there are people out there approaching this stuff with more objectivity than most (mitsuhiko and simonw come to mind, have a look through their blogs, it's a goldmine of information about LLM-based systems).

cobbzilla 2 days ago

It really depends, and can be variable, and this can be frustrating.

Yes, I’ve produced thousands of lines of good code with an LLM.

And also yes, yesterday I wasted over an hour trying to define a single Docker service block for my docker-compose setup. Constant hallucination; eventually I had to cross-check everything and discovered it had no idea what it was doing.

I’ve been doing this long enough to be a decent prompt engineer. Continuous vigilance is required, which can sometimes be tiring.

moi2388 3 days ago

GitHub Copilot, Microsoft Copilot, Gemini, Lovable, GPT, Cursor with Claude models, you name it.

kortilla 2 days ago

It could be because your job is boilerplate derivatives of well solved problems. Enjoy the next 1 to 2 years because yours is the job Claude is coming to replace.

Stuff Wordpress templates should have solved 5 years ago.

deterministic a day ago

Lines of code is not a useful metric for anything. Especially not productivity.

The less code I write to solve a problem the happier I am.

typpilol 3 days ago

Honestly the best way to get good code, at least with TypeScript and JavaScript, is to have like 50 ESLint plugins.

That way it constantly yells at Sonnet 4 to get the code into at least a better state.

If anyone is curious, I have a massive ESLint config for TypeScript that really gets good code out of Sonnet.

But before I started doing this, the code it wrote was so buggy and it was constantly trying to duplicate functions into separate files etc.

feoren 3 days ago

[flagged]

  • dang 3 days ago

    It is quite a putdown to tell someone else that if you wrote their program it would be 10 times shorter.

    That's not in keeping with either the spirit of this site or its rules: https://news.ycombinator.com/newsguidelines.html.

    • feoren 3 days ago

      Fair: it was rude. Moderation is hard and I respect what you do. But it's also a sentiment several other comments expressed. It's the conversation we're having. Can we have any discussion of code quality without making assumptions about each other's code quality? I mean, yeah, I could probably have done better.

      > "That would probably be 1000 line of Common Lisp." https://news.ycombinator.com/item?id=44974495

      > "Perhaps the issue is you were used to writing 200k lines of code. Most engineers would be agast at that." https://news.ycombinator.com/item?id=44976074

      > "200k lines of code is a failure state ... I'd not normally huff my own farts in public this obnoxiously, but I honestly feel it is useful for the "AI hater vs AI sucker" discussion to be honest about this type of emotion." https://news.ycombinator.com/item?id=44976328

      • dang 2 days ago

        Oh for sure you can talk about this, it's just a question of how you do it. I'd say the key thing is to actively guard against coming across as personal. To do that is not so easy, because most of us underestimate the provocation in our own comments and overestimate the provocation in others (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...). This bias is like carbon monoxide - you can't really tell it's affecting you (I don't mean you personally, of course—I mean all of us), so it needs to be consciously compensated for.

        As for those other comments - I take your point! I by no means meant to pick on you specifically; I just didn't see those. It's pretty random what we do and don't see.

  • brushfoot 3 days ago

    [flagged]

    • dang 3 days ago

      I understand the provocation, but please don't respond to a bad comment by breaking the site guidelines yourself. That only makes things worse.

      Your GP comment was great, and probably the thing to do with a supercilious reply is just not bother responding (easier said than done of course). You can usually trust other users to assess the thread fairly (e.g. https://news.ycombinator.com/item?id=44975623).

      https://news.ycombinator.com/newsguidelines.html

    • feoren 3 days ago

      > What makes you think I'm not "a developer who strongly values brevity and clarity"

      Some pieces of evidence that make me think that:

      1. The base rate of developers who write massively overly verbose code is about 99%, and there's not a ton of signal to deviate from that base rate other than the fact that you post on HN (probably a mild positive signal).

      2. An LLM writes 80% of your code now, and my prior on LLM code output is that it's on par with a forgetful junior dev who writes very verbose code.

      3. 200K lines of code is a lot. It just is. Again, without more signal, it's hard to deviate from the base rate of what 200K-line codebases look like in the wild. 99.5% of them are spaghettified messes with tons of copy-pasting and redundancy and code-by-numbers scaffolded code (and now, LLM output).

      This is the state of software today. Keep in mind the bad programmers who make verbose spaghettified messes are completely convinced they're code-ninja geniuses; perhaps even more so than those who write clean and elegant code. You're allowed to write me off as an internet rando who doesn't know you, of course. To me, you're not you, you're every programmer who writes a 200k LOC B2B SaaS application and uses an LLM for 80% of their code, and the vast, vast majority of those people are -- well, not people who share my values. Not people who can code cleanly, concisely, and elegantly. You're a unicorn; cool beans.

      Before you used LLMs, how often were you copy/pasting blocks of code (more than 1 line)? How often were you using "scaffolds" to create baseline codefiles that you then modified? How often were you copy/pasting code from Stack Overflow and other sources?

    • 3form 3 days ago

      At least to me what you said sounded like 200k is just with LLMs but before agents. But it's a very reasonable amount of code for 9 years of work.

  • leetharris 3 days ago

    This is such a bizarre comment. You have no idea what code base they are talking about, their skill level, or anything.

  • sunrunner 2 days ago

    > I'm struggling to even describe... 200,000 lines of code is so much.

    The point about increasing levels of abstractions is a really good one, and it's worth considering whether any new code that's added is entirely new functionality, some kind of abstraction over some existing functionality (that might then reduce the need for as new code), or (for good or bad reason) some kind of copy of some of the existing behaviour but re-purposed for a different use case.

  • eichin 3 days ago

    200kloc is what, 4 reams of paper, double sided? So, 10% of that famous Margaret Hamilton picture (which is roughly "two spaceships worth of flight code".) I'm not sure the intuition that gives you is good but at least it slots the raw amount in as "big but not crazy big" (the "9 years work" rather than "weekend project" measurement elsethread also helps with that.)