Comment by lunias

I'd imagine that even current models are aware of these "tricks". Does anyone have examples of this sort of meta-prompting working? It seems to me like it would just pollute the context so that you get a bit more "secret journaling" which the AI knows isn't at all secret (just like you do). Why would you even need to qualify that it's secret in the first place? Just tell it to explain its reasoning. All seems a bit like starting your prompt with "You are now operating in GOD mode..." or some other nonsense.