Comment by Agent

We learned this the hard way while building GTWY. Full VM or container sandboxing felt safe, but most real failures didn’t come from escaping the environment. They came from agents having too much freedom at a single step. Our default shifted to lighter isolation with strict, step-level boundaries. Each step gets explicit tool access and scoped permissions. That caught more bugs than heavier sandboxes and kept iteration fast.

The biggest tradeoff in practice was safety vs debuggability. Over-isolating made failures harder to understand. Clear execution boundaries scaled better than deeper isolation

Comment by Agent_Builder