Comment by tptacek
I had a conversation in a chat room yesterday about AI-assisted math tutoring where a skeptic said that the ability of GPT-5 to effortlessly solve quotient differentials or partial fraction decomposition or rational inequalities wasn't indicative of LLM improvements, but rather just represented the LLMs driving CAS tools and thus didn't count.
As a math student, I can't possibly care less about that distinction; either way, I paste in a worked problem solution and ask for a critique, and either way I get a valid output like "no dummy multiply cos into the tan before differentiating rather than using the product rule". Prior to LLMs, there was no tool that had that UX.
In the same way: LLMs are probably mostly not off the top of their "heads" (giant stacks of weight matrices) axiomatically deriving vulnerabilities, but rather just doing a very thorough job of applying existing program analysis tools, assembling and parallel-evaluating large numbers of hypotheses, and then filtering them out. My interlocutor in the math discussion would say that's just tool calls, and doesn't count. But if you're a vulnerability researcher, it doesn't matter: that's a DX that didn't exist last year.
As anyone who has ever been staffed on a project triaging SAST tool outputs would attest: it extremely didn't exist.
I don't care if it counts as true LLM brilliance or not.
If it doesn't matter whether it's AI or not, only that they're good tools, why even advertise the AI keyword all over it? Just say "best in class security analysis toolset". It's proprietary anyway, so you can't know how much of it is actually AI (unless you reproduce its results, which is the core argument you missed here).