axoltl 4 days ago

I can't comment on MCP use specifically, but I can comment on using an LLM while reversing. I use a local instance of whatever is currently SOTA for local reasoning LLMs at 30B-70B params, quantized to 4-6 bits. I feed it decompiled code to identify functions that are 'tedious' to reverse engineer. I recently reversed a binary that was compiled with soft float and had no symbols or strings; a lot of those functions end up being a ton of bit-twiddling. While I reversed the business logic, I had the reasoning model identify the soft-float functions with very minimal prompting. It did quite well on those!
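For concreteness, the loop looks roughly like the sketch below: dump a decompiled function, ask the model whether it's a soft-float helper, read back its guess. This is a minimal illustration assuming a local OpenAI-compatible endpoint (llama.cpp's server, Ollama, etc.); the URL, model name, decompiled snippet, and prompt wording are placeholders, not my exact setup.

```python
# Minimal sketch: classify one decompiled function with a local
# OpenAI-compatible server (e.g. llama.cpp's llama-server or Ollama).
# URL, model name, and the decompiled snippet are illustrative placeholders.
import requests

DECOMPILED_FN = """
uint __fastcall sub_8004A2C(uint a1, uint a2)
{
  // ... lots of shifts, masks, and exponent/mantissa juggling ...
}
"""

prompt = (
    "The following function comes from a binary compiled with soft-float and "
    "has no symbols. Decide whether it is a soft-float runtime helper "
    "(e.g. __aeabi_fadd, __aeabi_fmul, float-to-int conversion). "
    "Answer with the most likely libgcc/compiler-rt name and one sentence of "
    "reasoning.\n\n" + DECOMPILED_FN
)

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # assumed local endpoint
    json={
        "model": "local-reasoning-model",          # placeholder model name
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```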

I also tried to have it automatically build some structs from code showing the access patterns, and it failed miserably at that task. A larger model (o3 or Opus) would likely do better here.
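To be concrete about what I mean by building structs from access patterns, the sketch below shows the shape of that prompt: hand the model every access through one pointer and ask for a single struct definition that explains them all. The snippet, field names, and offsets are made up for illustration; this is the task where the smaller local models fell over.

```python
# Sketch of a struct-recovery prompt. The decompiled accesses and the
# "correct answer" shown in the trailing comment are fabricated examples.
ACCESSES = """
*(uint32_t *)(a1 + 0x00) = 1;
*(uint16_t *)(a1 + 0x04) = v7;
v9 = *(uint8_t *)(a1 + 0x06);
*(uint32_t *)(a1 + 0x08) = *(uint32_t *)(a1 + 0x08) + 1;
"""

prompt = (
    "These are all the accesses through the same pointer `a1` in a decompiled "
    "function. Propose one C struct (with offsets as comments) that explains "
    "every access, including any padding:\n" + ACCESSES
)
# (send `prompt` to the same local endpoint as in the earlier sketch)
#
# A correct answer would look roughly like:
#   struct state {          // offset
#       uint32_t flags;     // 0x00
#       uint16_t count;     // 0x04
#       uint8_t  mode;      // 0x06
#       uint8_t  _pad;      // 0x07
#       uint32_t counter;   // 0x08
#   };
```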

I personally don't think letting an LLM do large parts of the reversing would be useful to me, since I build up a lot of my mental model of the system during the process and I'd be missing out on that. But for handling annoying bits of code that I'd likely just skip otherwise? Go ham!

segmondy 4 days ago

You hit the target on what most miss about LLMs: part of the work is building up a mental model of the system you're working on. When the LLM does the work, it's easy to miss out on that mental model.

  • jhart99 4 days ago

    I tried to use an LLM for assistance with reversing some embedded code and agree with this. I had built up a pretty decent model of what was going on before starting. It was able to explain one perplexing function quite well, but when I'd feed it decent-sized blocks of code it would hallucinate like crazy. I was quite happy with its performance at finding the basic library and ROM functions and annotating them correctly, though. I think it's all in how you use it.
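    If it helps anyone, the "annotating them correctly" step can be wired up with a small script once the model has produced names. Below is a hypothetical Ghidra (Jython) sketch of that step -- not necessarily the tool or script I actually used, and the addresses and names are placeholders.

    ```python
    # Hypothetical Ghidra (Jython) script: apply LLM-suggested names for
    # library/ROM functions. Assumes you've already dumped each function's
    # decompilation, asked the model for a name, and saved the results into
    # name_map. Run from Ghidra's Script Manager; currentProgram is provided
    # by the script environment. Addresses and names below are placeholders.
    from ghidra.program.model.symbol import SourceType

    # entry-point address (hex string) -> name suggested by the LLM pass
    name_map = {
        "0x08001234": "memcpy",
        "0x080015a0": "rom_uart_putc",
    }

    fm = currentProgram.getFunctionManager()
    for func in fm.getFunctions(True):  # True = iterate in address order
        addr = "0x%08x" % func.getEntryPoint().getOffset()
        if addr in name_map:
            func.setName(name_map[addr], SourceType.USER_DEFINED)
            func.setComment("LLM-suggested: " + name_map[addr])
    ```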