Comment by schrodinger 2 days ago
Genuinely curious — how did you isolate the effect of comments/context on model performance from all the other variables that change between sessions (prompt phrasing, model variance, etc)? In other words, how did you validate the hypothesis that "turning off the comments" (assuming you mean stripping them temporarily...) resulted in an objectively superior experience?
What did your comparison process look like? It feels intuitively accurate and matches my anecdotal impression, but I'd love to hear the rigor behind your conclusions!
I was already in the habit of copy-pasting relevant code sections to maximize reasoning performance and squeeze more out of earlier, weaker models on stubborn problems. (Still do this on really nasty ones.)
It's also easy to notice that LLMs create garbage comments that get worse over time. I started deleting all comments manually, alongside the manual snippet selection, to get max performance.
Then I started routinely deleting all comments before any big problem-solving session. Was doing it often enough that I built some automation.
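For the curious: the automation doesn't need to be fancy. Something along these lines does the job (a minimal sketch, not my exact script, assuming Python source files that tokenize cleanly and using only the standard library's tokenize module):

```python
#!/usr/bin/env python3
# strip_comments.py: rough sketch of "delete all comments before a session".
# Assumes Python source files that tokenize cleanly; standard library only.
import io
import sys
import tokenize

def strip_comments(source: str) -> str:
    """Remove # comments while leaving code, strings, and layout intact."""
    lines = source.splitlines(keepends=True)
    # Find every COMMENT token and cut it off its line; full-line comments
    # simply become blank lines.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            row, col = tok.start  # rows are 1-based in tokenize
            line = lines[row - 1]
            newline = "\n" if line.endswith("\n") else ""
            lines[row - 1] = line[:col].rstrip() + newline
    return "".join(lines)

if __name__ == "__main__":
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as f:
            cleaned = strip_comments(f.read())
        with open(path, "w", encoding="utf-8") as f:
            f.write(cleaned)
```

Something like `python strip_comments.py src/*.py` rewrites the files in place, so run it on a clean git tree and revert after the session if you want the comments back.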
Maybe high-quality human comments improve ability? Hard to test in a hybrid codebase.