Comment by cubefox
I mean, if a wife says to her husband, "The traffic light is green," then this may count as an instruction to get going. But usually declarative sentences aren't interpreted as instructions. And we are perfectly able to not interpret even text with imperative sentences (inside quotes, in films, etc.) as an instruction to _us._ I don't see why an LLM couldn't likewise learn not to execute explicit instructions inside quotes. It should be doable with SFT or RLHF.
The economic value associated with solving this problem right now is enormous. If you think you can do it, I would very much encourage you to try!
Every intuition I have from following this space for the last three years tells me that there is no simple solution waiting to be discovered.