Comment by jsheard
It seems plausible that stressing the importance of the system prompt instructions might do something, but I don't see how telling the model not to hallucinate would work. How could the model know that its most likely prediction has gone off the rails, without any external point of reference?
Internally, LLMs know a whole lot more about the truth and uncertainty of their predictions than they say. Pushing that into words is difficult but not impossible.
https://news.ycombinator.com/item?id=41504226
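
To make the "knows more than it says" point concrete, here is a minimal sketch (not from the linked thread) of one crude way to read that internal uncertainty out: inspect the per-token probabilities the model assigns while generating, using Hugging Face transformers. The model name, prompt, and the idea of treating low probabilities as a "guessing" signal are illustrative assumptions, not a hallucination detector.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM exposing logits works here
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "The capital of Australia is"
inputs = tok(prompt, return_tensors="pt")

with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=5,
        do_sample=False,
        output_scores=True,          # keep the logits for each generated step
        return_dict_in_generate=True,
    )

# Probabilities the model assigned to the tokens it actually emitted.
gen_tokens = out.sequences[0, inputs["input_ids"].shape[1]:]
for step_logits, token_id in zip(out.scores, gen_tokens):
    probs = torch.softmax(step_logits[0], dim=-1)
    print(f"{tok.decode(token_id)!r}: p={probs[token_id].item():.3f}")

# Low probability (or high entropy) on the key tokens of an answer is a rough
# signal that the model is guessing rather than recalling, even though the
# generated text itself reads just as confidently either way.
```

Turning a signal like this into calibrated, spoken-aloud uncertainty is the hard part the comment alludes to, but the raw material is there in the model's own distribution.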