Comment by mmastrac

Comment by mmastrac 4 days ago

1 reply

I tried it on a low-complexity Rust PR I worked on a few months back and it did a pretty good job. I'd probably change where the highlights live (for example x.y.z() -> x.w.z() should highlight y/w in a lot of cases).

For the most part, it seems to draw the eye to the general area where you need to look closer. It found a near-invisible typo in a coworker's PR which was kind of interesting as well.

https://0github.com/geldata/gel-rust/pull/530

It seems to flag _some_ deletions as needing attention, but I feel like a lot of them are ignored.

Is this using some sort of measure of distance between the expected token in this position vs the actual token?

EDIT: Oh, I guess it's just an LLM prompt? I would be interested to see an approach where the expected token vs actual token generates a heatmap.

lawrencechen 4 days ago

Happy to hear!

> Is this using some sort of measure of distance between the expected token in this position vs the actual token?

The main implementation is in this file: https://github.com/manaflow-ai/cmux/blob/main/apps/www/lib/s...

EDIT: yeah it's just a LLM prompt haha

Just a simple prompt right now, but I think we could try an approach where we directly see which tokens might be hallucinated. Gonna try to find the paper for this idea. Might be kinda analogous to the "distance between the expected token in this position vs the actual token."