Comment by simonw
This has become especially true for me in the past four months. The new long context reasoning models are shockingly good at digging through larger volumes of gnarly code. o3, o4-mini and Claude 3.7 Sonnet "thinking" all have 200,000 token context limits, and Gemini 2.5 Pro and Flash can do 1,000,000. As "reasoning" models they are much better suited to following the chain of a program to figure out the source of an obscure bug.
Makes me wonder how many of the people who continue to argue that LLMs can't help with large existing codebases are missing that you need to selectively copy the right chunks of that code into the model to get good results.
But 1 million tokens is like 50k lines of code or something. That's only medium sized. How does that help with large complex codebases?
What tools are you guys using? Are there none that can interactively probe the project in a way that a human would, e.g. use code intelligence to go-to-definition, find all references and so on?