Comment by jclay
Exciting work! I’ve often wondered if an LLM with the right harness could restore and optimize an aging C/C++ codebase. It would be quite compelling to get an old game engine running again on a modern system.
I would expect most of these systems come with very carefully guarded access controls. It also strikes me as a uniquely difficult challenge to track down the decision maker who is willing to take the risk on revamping these systems (AI or not). Curious to hear more about what you’ve learned here.
Also curious to hear how LLMs perform on a language like COBOL that likely doesn’t have many quality samples in the training data.
Thank you!
The decision makers we work with are typically modernization leaders and mainframe owners — usually director or VP level and above. There are a few major tailwinds helping us get into these enterprises:
1. The SMEs who understand these systems are retiring, so every year that passes makes the systems more opaque.
2. There’s intense top-down pressure across Fortune 500s to adopt AI initiatives.
3. Many of these companies are paying IBM 7–9 figures annually just to keep their mainframes running.
Modernization has always been a priority, but the perceived risk was enormous. With today’s LLMs, we’re finally able to reduce that risk in a meaningful way and make modernization feasible at scale.
You’re absolutely right about COBOL’s limited presence in training data compared to languages like Java or Python. Because COBOL is highly structured and readable, current reasoning models get us to an acceptable level of performance, where it’s now valuable to use them for these tasks. For near-perfect accuracy (95%+), we see a large opportunity to build domain-specific frontier models purpose-built for these legacy systems.