tcdent 3 days ago

This style of prompting, where you set up a dire scenario to try to evoke some "emotional" response from the agent, is already dated. At one point, putting words like IMPORTANT in all uppercase had some measurable impact, but these days models just follow instructions.

Save yourself the experience of having to write and maintain prompts like this.
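
For illustration, the contrast looks something like this (both snippets are made up):

  Dated: "URGENT: You MUST follow these rules EXACTLY or the
  entire deployment will FAIL and data will be LOST."

  Current: "Follow these rules exactly."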

bcoates 3 days ago

Also the persuasion paper he links isn't at all about what he's talking about.

That paper is about using persuasion prompts to overcome trained-in "safety" refusals, not to improve prompt conformance.

  • danshapiro 2 days ago

    Co-author of the paper here. We don't know exactly why modern LLMs don't want to call you a jerk, or for that matter why persuasive techniques convince them otherwise. It's not a hard line like many of the guardrails. That said, I talked to Jesse about this, and I strongly suspect the same techniques will work for prompt conformance when the topic is something other than name-calling.

    • diamond559 2 days ago

      It's because they're programmed to be agreeable and friendly so that you'll keep using them.

    • make3 2 days ago

      Isn't that just instruction fine-tuning and RLHF inducing style and deference? Why is that surprising?

kasey_junk 3 days ago

What’s irritating is that the LLMs haven’t learned this about themselves yet. If you ask an LLM to improve its instructions, those are the sorts of improvements it will suggest.

It is the thing I find most irritating about working with LLMs and agents. They seem forever a generation behind in capabilities that are self-referential.

  • danielbln 3 days ago

    LLMs will also happily put time estimates on work packages that are based on pre-LLM turnaround times.

    "Phase 2 will take about one week"

    No, Claude, it won't, because you and I will bang this thing out in a few hours.

    • mceachen 3 days ago

      "Refrain from including estimated task completion times." has been in my ~/.claude/CLAUDE.md for a while. It helps.

      • no-name-here 2 days ago

        Do such instructions take up a tiny bit more attention/context from LLMs, and consequently is it better to leave them out and just ignore such output?

        • mceachen 2 days ago

          I have to balance this with what I know about my reptile brain. It’s distracting to me when Claude declares that I’m “absolutely right!” or making a “brilliant insight,” so it’s worth it to me to spend the couple context tokens and tell them to avoid these cliches.

          (The latest Claude has a `/context` command that’s great at measuring this stuff btw)
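
          If you want an actual number rather than a feel, one rough check (a sketch using the Anthropic Python SDK's token-counting endpoint; the model name is illustrative) is to diff the count with and without the rule:

            import anthropic

            client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
            msgs = [{"role": "user", "content": "hi"}]
            rule = "Refrain from including estimated task completion times."

            # Count input tokens without the rule, then with it as a system prompt.
            base = client.messages.count_tokens(
                model="claude-sonnet-4-20250514", messages=msgs
            )
            with_rule = client.messages.count_tokens(
                model="claude-sonnet-4-20250514", system=rule, messages=msgs
            )
            # The difference is the rule's cost: on the order of ten tokens.
            print(with_rule.input_tokens - base.input_tokens)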

  • conorcleary 2 days ago

    Comments like yours on posts like these by humans like us will create a philosophical lens out of the ether that future LLMs will harvest for free and then paywall.