Comment by worldsayshi
Comment by worldsayshi a day ago
Yes as a rule an LLM should never be given access to information that it is not expected to share.
Although it would still be interesting to know if they could hold on to secrets, even if they never should need to.
I'm not sure that's right. You can write prompts that make use of, but don't disclose secret information.
I have valid live, production cases where we do this and don't have info-leaking problems due to the scaffolding / prompting techniques we use.
Part of the problem is that confidentiality is in the eye of the beholder so extra effort needs to be taken to make explicit what should and should not be shared.
That said, one valid conclusion that could be drawn from this research is that base models are currently insufficient at exercising nuanced judgment about what should and should not be disclosed without explicit instruction.
That's an interesting thing to know and would be a good place for model builders to put some effort.