alexpham14 2 days ago

Compliance is usually the hard stop before we even get to capability. We can’t send code out, and local models are too heavy to run on the restricted VDI instances we’re usually stuck with. Even when I’ve tried it on isolated sandbox code, it struggles with the strict formatting. It tends to drift past column 72 or mess up period termination in nested IFs. You end up spending more time linting the output than it takes to just type it. It’s decent for generating test data, but it doesn't know the forty years of undocumented business logic quirks that actually make the job difficult.
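
The column 72 part is at least cheap to catch mechanically; this is roughly the kind of thing I end up running over the output first (an illustrative Python sketch, not our actual linter):

    import sys

    AREA_B_END = 72  # fixed-format COBOL: code must end by column 72

    def lint_fixed_format(path):
        """Flag lines whose code spills past column 72."""
        problems = []
        with open(path, errors="replace") as f:
            for lineno, line in enumerate(f, start=1):
                width = len(line.rstrip("\n"))
                if width > AREA_B_END:
                    problems.append((lineno, width))
        return problems

    if __name__ == "__main__":
        for lineno, width in lint_fixed_format(sys.argv[1]):
            print(f"line {lineno}: {width} columns (limit {AREA_B_END})")

The period-termination mistakes are the expensive ones; catching those takes an actual parser, not a width check.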

apaprocki 2 days ago

To be fair, I would not expect a model to output perfectly formatted C++. I’d let it output whatever it wants and then run it through clang-format, the same as I would for a human. Even the best humans who have the formatting rules in their head will miss a few things here and there.
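
Concretely, something like this in whatever glue layer receives the model’s output (a sketch; --style=file assumes the repo already carries a .clang-format):

    import subprocess

    def format_cpp(source: str) -> str:
        """Normalize C++ through clang-format instead of trusting the producer."""
        result = subprocess.run(
            ["clang-format", "--style=file"],  # picks up the repo's .clang-format
            input=source,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout

    # e.g. on raw model output:
    print(format_cpp("int   main( ){return 0 ;}"))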

If there are 40 years of undocumented business quirks, document them and then re-evaluate. A human new to the codebase would fail under the same conditions.

  • shakna 2 days ago

    Formatting isn't just visual in pre-79 COBOL or Fortran. It's syntax. It's a compile failure, or worse, it cuts the line and can sometimes successfully compile into something else.

    That's not just an undocumented quirk, but a fundamental part of being a punch-card-ready language.
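
    Roughly what the reader does before the parser even runs (a toy Python sketch of the fixed-format rule, not any particular compiler):

        def code_area(line: str) -> str:
            """Columns a fixed-format compiler actually parses: 8 through 72.
            Columns 1-6 are sequence numbers, 7 is the indicator,
            and 73-80 were card identification - silently ignored."""
            return line[7:72]

        # a statement whose tail drifts past column 72:
        stmt = "MOVE 250000 TO WS-CREDIT-LIMIT."
        line = ("000100" + " " * 40 + stmt).ljust(80)
        print(code_area(line).strip())
        # -> 'MOVE 250000 TO WS-CREDIT-L': the tail, period included,
        #    vanished before the parser ever saw it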

  • raw_anon_1111 2 days ago

    With C++, formatting is optional. A better test case for LLMs is Python, where indentation specifies code blocks. Even ChatGPT 3.5 got the formatting for Python and YAML correct - mind you, the actual code back then was often hilariously wrong.
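
    A toy illustration of why that's the stricter test - the same statement at two indent levels is two different programs, and both run without complaint:

        items = [1, -2, 3]

        kept = []
        for n in items:
            if n > 0:
                kept.append(n)
            print("checked", n)   # dedented: runs for every item

        kept = []
        for n in items:
            if n > 0:
                kept.append(n)
                print("kept", n)  # one level deeper: runs only when n > 0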

    • to11mtm 2 days ago

      I can't even get GitHub Copilot's plugin to avoid randomly trashing files with a zero-width no-break space (U+FEFF) at the beginning, let alone follow formatting rules consistently...
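
      At this point I just keep a sweeper script around (Python sketch; the glob pattern is whatever your tree needs):

          from pathlib import Path

          BOM = "\ufeff"  # zero-width no-break space; a BOM when it leads the file

          def strip_leading_bom(path: Path) -> bool:
              """Drop a stray U+FEFF from the start of a text file."""
              text = path.read_text(encoding="utf-8")  # plain utf-8 keeps the BOM visible
              if text.startswith(BOM):
                  path.write_text(text.lstrip(BOM), encoding="utf-8")
                  return True
              return False

          for p in Path(".").rglob("*.cs"):  # adjust the pattern to taste
              if strip_leading_bom(p):
                  print("de-BOMed", p)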

      • raw_anon_1111 2 days ago

        I am the last person to say anything good about Copilot. I used Copilot for a minute, mostly used raw ChatGPT until last month, and now use Codex with my personal subscription to ChatGPT and my personal but company-reimbursed subscription to Claude.

    • apaprocki 2 days ago

      A quick search finds many COBOL checkers. I’d be very surprised if a modern model was not able to fix its own mistakes if connected to a checker tool. Yes, it may not be able to one-shot it perfectly, but if it can quickly call a tool once and it “works”, does it really matter much in the end? (Maybe it matters from a cost perspective, but I’m just referring to it solving the problem you asked it to solve.)
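
      The loop I have in mind is nothing fancier than this (a sketch: cobol-check and ask_model are stand-ins for whatever checker and model call you actually wire up, not real APIs):

          import subprocess

          def check(path: str) -> str:
              """Run the checker; return diagnostics, or "" if clean."""
              result = subprocess.run(
                  ["cobol-check", path],  # stand-in CLI; substitute your tool
                  capture_output=True,
                  text=True,
              )
              return (result.stdout + result.stderr) if result.returncode else ""

          def fix_until_clean(path: str, ask_model, max_rounds: int = 5) -> bool:
              """Feed diagnostics back to the model until clean or out of rounds."""
              for _ in range(max_rounds):
                  diagnostics = check(path)
                  if not diagnostics:
                      return True
                  revised = ask_model(open(path).read(), diagnostics)
                  with open(path, "w") as f:
                      f.write(revised)
              return False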

      Clearly it isn’t just “broken” for everyone; see “Claude Code modernizes a legacy COBOL codebase” from Anthropic:

      https://youtu.be/OwMu0pyYZBc

      • shakna 2 days ago

        Taking Anthropic's reporting on Anthropic at face value is not something you should really do.

        In this case, a five-stage pipeline, built on demo environments and code that were already in the training data, was successful. I see more red flags there than green.

akhil08agrawal 2 days ago

Nuances of a codebase are the key. But I guess we are accelerating towards solving that. Let's see how much time this will take.

  • layer8 2 days ago

    The critical “why” knowledge often cannot be derived from the code base.

    The prohibitions on other companies (LLM providers) being able to see your code also won’t be going away soon.

    • Muromec 2 days ago

      Other companies can see the code, that isn’t a problem. The problem with LLMs is the idea that the code leaks out to companies other than the LLM provider.

      That’s something that can either be solved for real or be promised not to happen.

      • layer8 a day ago

        > Other companies can see the code, that isn’t a problem.

        It actually is a restriction in many industries.