Comment by jackblemming 3 days ago

Seems cute, but ultimately not very valuable without benchmarks or some kind of evaluation. For all I know, this could make Claude worse.

jelling 3 days ago

Same. We've all fooled ourselves into believing that an LLM / stochastic process was finally solved based on one good result. But the sample size is always too low to be meaningful.

anuramat 3 days ago

even if it works as described, I'm assuming it's extremely model-dependent (e.g. book prerequisites), so you'd have to re-run this for every model you use; this is basically poor man's fine-tuning.

maybe explicit support from providers would make it feasible?
