Comment by languid-photic
Comment by languid-photic 4 days ago
Yes, typically the agent is just trying to do what it's been instructed to do, but sometimes it's too naive to realize its approach is a bit sketchy.
And actually, one way we've hardened our sandbox is by tasking agents with impossible tasks (within the sandbox), then analyzing and patching each workaround.