Comment by bcoates
Also, the persuasion paper he links isn't at all about what he's talking about.
That paper is about using persuasion prompts to overcome trained-in "safety" refusals, not to improve prompt conformance.
Co-author of the paper here. We don't know exactly why modern LLMs don't want to call you a jerk, or for that matter why persuasive techniques convince them otherwise. It's not a hard line like many of the guardrails. That said, I talked to Jesse about this, and I strongly suspect the same techniques will work for prompt conformance when the topic is something other than name-calling.