Comment by simonw
Comment by simonw 19 hours ago
I was hoping for a moment that this meant they had come up with a design that was safe against lethal trifecta / prompt injection attacks, maybe by running everything in a tight sandbox and shutting down any exfiltration vectors that could be used by a malicious prompt attack to steal data.
Sadly they haven't completely solved that yet. Instead their help page at https://support.claude.com/en/articles/13364135-using-cowork... tells users "Avoid granting access to local files with sensitive information, like financial documents" and "Monitor Claude for suspicious actions that may indicate prompt injection".
(I don't think it's fair to ask non-technical users to look out for "suspicious actions that may indicate prompt injection" personally!)
Worth calling out that execution runs in a full virtual machine with only user-selected folders mounted in. CC itself runs, if the user set network rules, with https://github.com/anthropic-experimental/sandbox-runtime.
There is much more to do - and our docs reflect how early this is - but we're investing in making progress towards something that's "safe".