The Problem With Full Access
An AI agent that runs commands directly on your machine does so with your permissions, which in practice means access to everything you can touch. The agent that helps you write code can also, if something goes wrong, delete files, modify system settings, or run processes you did not ask for.
Docker sandboxes change this. They give the agent a contained environment to work in: a virtual machine with access to exactly what it needs and nothing more. Actions taken inside the sandbox do not affect the host system unless you explicitly let them through.
It is the same principle as a physical sandbox: the agent can build and destroy things freely inside the box. Outside the box, nothing happens.
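The principle can be shown with a toy analogy in Python. This is only a sketch of the idea, not how Docker sandboxes are implemented: the "sandbox" here is just a throwaway copy of a directory, and the names run_contained and reckless_agent are hypothetical. A real sandbox also isolates processes and the network, which a copied folder does not.

```python
import shutil
import tempfile
from pathlib import Path


def run_contained(project: Path, action) -> None:
    """Run `action` against a throwaway copy of `project`, never the original."""
    with tempfile.TemporaryDirectory() as scratch:
        workdir = Path(scratch) / project.name
        shutil.copytree(project, workdir)
        action(workdir)  # whatever it breaks exists only in the copy
    # The copy is discarded here; nothing is written back to `project`.


def reckless_agent(workdir: Path) -> None:
    """Stand-in for an agent mistake: delete every file it can see."""
    for path in list(workdir.rglob("*")):
        if path.is_file():
            path.unlink()


def demo() -> list[str]:
    """Create a small fake project, let the 'agent' wreck a copy, list the original."""
    with tempfile.TemporaryDirectory() as host:
        project = Path(host) / "project"
        project.mkdir()
        (project / "main.py").write_text("print('hello')\n")
        (project / "notes.txt").write_text("keep me\n")
        run_contained(project, reckless_agent)
        return sorted(p.name for p in project.iterdir())


if __name__ == "__main__":
    print(demo())
```

The agent destroys everything inside the box, and the original project still holds both files when it is done.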
How It Works in Practice
The setup is a single command: sbx run codex. The sandbox spins up a container, pulls the agent image, and starts running in what the documentation calls YOLO mode: full execution without approval prompts for every action, which keeps it fast. The difference from running directly on your host is that the blast radius of any mistake is confined to the container.
Something interesting happened in one demo: on the first message, the sandbox threw a warning. "Falling back from websockets to HTTPS, transport stream disconnected, attack attempt detected." The sandbox had flagged something as suspicious and switched to a safer transport method automatically. The agent kept working. The security layer caught something without interrupting the workflow.
That is exactly what good sandboxing looks like. The agent does not know the containment is there. You do not have to manage it actively. It just runs in the background and handles anomalies when they happen.
Why This Matters for the Current Moment
The conversation around AI agent security has been dominated by credential exposure and permission architecture: the mistakes that happen before the agent runs. Docker sandboxing addresses a different category of risk: what happens during execution when something goes wrong or the agent behaves unexpectedly.
An agent operating on real production code with real credentials can cause real damage if it makes a wrong assumption at step twelve of a twenty-step process. The damage is proportional to the access. The sandbox limits the access. The agent can still do the work. The scope of a mistake is contained.
For teams shipping agents into any environment where errors have consequences (code review, file management, API calls, customer data), sandboxing is the execution-layer equivalent of the read-only / write-scoped permission architecture at the access layer.
The Setup Is One Command
That is the part worth emphasizing. This is not a complex infrastructure change. It is not a new service to maintain or a new security policy to enforce across a team.
You run sbx run codex from your project directory. The container starts. The agent runs inside it. When you are done, the container stops. Nothing persistent. No residual changes to your host environment.
The security comes from the isolation, which is provided by Docker's containerization. You do not have to build it. You do not have to configure much. You just have to decide to run the agent in the sandbox instead of directly on your machine.
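One way to make that decision stick is a small launcher that only ever starts the agent through the sandbox. The sketch below is a hypothetical Python wrapper, not part of Docker's tooling. It assumes the sbx binary is on your PATH and uses only the sbx run codex invocation quoted above; passing a different agent name is an untested assumption.

```python
import shutil
import subprocess
import sys


def sandbox_command(agent: str = "codex") -> list[str]:
    """Build the invocation quoted in the article: sbx run <agent>."""
    return ["sbx", "run", agent]


def launch(agent: str = "codex") -> int:
    """Start the agent in the sandbox; refuse rather than fall back to the host."""
    if shutil.which("sbx") is None:
        print("sbx not found on PATH; refusing to run the agent unsandboxed.",
              file=sys.stderr)
        return 1
    # Runs in the current working directory, so call this from your project directory.
    return subprocess.run(sandbox_command(agent)).returncode


if __name__ == "__main__":
    sys.exit(launch())
```

The one design choice that matters is the refusal: if the sandbox is unavailable, the launcher exits instead of quietly running the agent directly on your machine.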
Most developers have not made that decision yet.
Most developers should.