Comment by jsheard
It seems plausible that stressing the importance of the system prompt instructions might do something, but I don't see how telling the model not to hallucinate would work. How could the model know that its most likely prediction has gone off the rails, without any external point of reference?
Internally, LLMs know a whole lot more about the truth and uncertainty of their predictions than they say. Pushing that into words is difficult but not impossible.
https://news.ycombinator.com/item?id=41504226
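
To make the "knows more than it says" point concrete, here is a minimal sketch (not from the linked thread) of one crude way to read that internal uncertainty out: inspect the per-token probabilities the model assigns while generating, using Hugging Face transformers. The model name, prompt, and the idea of treating low probabilities as a "guessing" signal are illustrative assumptions, not a hallucination detector.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM exposing logits works here
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "The capital of Australia is"
inputs = tok(prompt, return_tensors="pt")

with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=5,
        do_sample=False,
        output_scores=True,          # keep the logits for each generated step
        return_dict_in_generate=True,
    )

# Probabilities the model assigned to the tokens it actually emitted.
gen_tokens = out.sequences[0, inputs["input_ids"].shape[1]:]
for step_logits, token_id in zip(out.scores, gen_tokens):
    probs = torch.softmax(step_logits[0], dim=-1)
    print(f"{tok.decode(token_id)!r}: p={probs[token_id].item():.3f}")

# Low probability (or high entropy) on the key tokens of an answer is a rough
# signal that the model is guessing rather than recalling, even though the
# generated text itself reads just as confidently either way.
```

Turning a signal like this into calibrated, spoken-aloud uncertainty is the hard part the comment alludes to, but the raw material is there in the model's own distribution.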