A better Ghidra MCP server – GhidrAssistMCP
(github.com) 101 points by jtang613 a day ago
I can't comment on MCP use specifically, but I can comment on using an LLM while reversing. I use a local instance of whatever ends up being SOTA for local reasoning LLMs at 30B-70B params, quantized to 4-6 bits. I feed it decompiled code to identify functions that are 'tedious' to reverse engineer. I recently reversed a binary that was compiled with soft float and had no symbols or strings. A lot of those functions end up being a ton of bit-twiddling. While I reversed the business logic, I had the reasoning model identify the soft float functions with very minimal prompting. It did quite well on those!
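For readers who haven't met soft float code in a decompiler, here is a rough sketch of the kind of bit-twiddling involved: a float-to-int truncation helper written with integer operations only. It is not taken from the binary discussed above; the names `f32_bits` and `soft_f32_to_i32` are made up for illustration, and the saturation behaviour on overflow, infinity and NaN is an assumption (real runtime libraries differ here).

```python
import struct


def f32_bits(x: float) -> int:
    """Return the raw 32-bit IEEE-754 single-precision pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def soft_f32_to_i32(bits: int) -> int:
    """Truncate a float32 (given as raw bits) toward zero using integer ops only.

    Hypothetical helper in the style of a soft-float conversion routine.
    """
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # biased exponent, bias is 127
    mantissa = bits & 0x7FFFFF       # 23 explicit fraction bits

    if exponent < 127:
        # |x| < 1.0, which also covers zero and subnormals
        return 0
    if exponent >= 158:
        # |x| >= 2**31, infinity or NaN: saturate (an assumption, see lead-in)
        return -(2**31) if sign else 2**31 - 1

    significand = mantissa | 0x800000  # restore the implicit leading 1
    shift = exponent - 150             # 127 (bias) + 23 (fraction bits)
    if shift >= 0:
        magnitude = significand << shift
    else:
        magnitude = significand >> -shift  # drops the fractional part
    return -magnitude if sign else magnitude
```

Stripped of names and comments, a routine like this is just masks, shifts and magic constants (0x7FFFFF, 0x800000, 127, 150), which is why it is tedious to identify by hand and a reasonable candidate for pattern recognition by a model.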
I also tried to have it automatically build some structs from code showing the access patterns, and it failed miserably on that task. Likely a larger model (o3 or opus) would do better here.
I personally don't think letting an LLM do large parts of the reversing would be useful to me, as I build up a lot of my mental model of the system during the process, so I'd be missing out on that. But for handling annoying bits of code I'd likely just forgo otherwise? Go ham!
I tried to use an LLM for assistance with reversing some embedded code and agree with this. I had built up a pretty decent model of what was going on before starting. It was able to explain what was going on in one perplexing function quite well, but when I'd feed it decent-sized blocks of code it would hallucinate like crazy. But I was quite happy with the performance at finding the basic library and ROM functions and annotating them correctly. I think it is all in how you use it.
Thanks for the interest. I wrote GhidrAssistMCP and the original GhidrAssist plugin, which work hand-in-hand, because I find they improve my RE workflow. They're not immune from hallucinations because the underlying models are not. However, hallucinations are fairly rare and I have had very reliable results with both Claude and ChatGPT. When used together, GhidrAssist+GhidrAssistMCP have been able to do some impressive analysis tasks.
If you're just getting back in the saddle, you might want to give both a try. In particular, GhidrAssist's "Explain Function" tool is really helpful at quickly summarizing code and reducing the mental overhead of making sense of large binaries.
Thanks so much for sharing!
I'm interested to see how MCP and the development in AI will impact the CTF scene in the future.
Thanks for sharing!
I was about to start doing this, then realized I shouldn't nerd-snipe myself... The original extension definitely felt user unfriendly, so I was using Claude Code manually, feeding it an exported listing file. The listing files lack full addresses, so it wasn't optimal source material.
It's not AI, but Ghidra has a cool feature called BSim which does something similar. Each function gets a "feature vector" which, now that I think about it, has some clear parallels to embeddings.
I've been wondering the same thing. However, you would have to have a very large database of embeddings for this to be useful, right?
OTOH I can see this being disproportionately helpful with reverse engineering Rust and Go binaries, which usually include many open-source dependencies.
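To make the feature-vector / embedding parallel concrete, here is a toy sketch of a nearest-match lookup by cosine similarity. This is not how BSim is implemented and does not use any Ghidra API; the vectors, function labels and the 0.9 threshold are invented purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, database, threshold=0.9):
    """Return (name, score) of the most similar known function, or None.

    `database` maps a function name to its vector; None is returned when the
    database is empty or nothing scores at or above `threshold`.
    """
    scored = [(cosine_similarity(query, vec), name) for name, vec in database.items()]
    if not scored:
        return None
    score, name = max(scored)
    return (name, score) if score >= threshold else None


# Hypothetical vectors for functions from a known open-source library build.
KNOWN = {
    "memcpy": [4, 1, 0, 2],
    "strlen": [1, 3, 0, 0],
    "crc32": [0, 2, 5, 1],
}

match = best_match([4, 1, 0, 3], KNOWN)     # close to the "memcpy" vector
no_match = best_match([1, 1, 1, 1], KNOWN)  # nothing scores above the threshold
```

The database-size concern shows up directly: a lookup like this can only name functions whose vectors were collected beforehand, which is why binaries that statically link many open-source dependencies look like the favourable case.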
It's been a few years since I've rolled up my sleeves and done some reverse engineering with Ghidra. The skill is very "use it or lose it", so I wonder if this will help me get back into it quicker. Or... a ton of hallucinations leading down dead-end rabbit holes.
Curious if anyone has given it a shot and can speak to the experience.