Sandboxing

Q: Which tools use Sandboxing?

Codegen and Devin are among the tools that incorporate sandboxing capabilities. See the related tools section for details.

Running AI agent code in an isolated environment that cannot affect live systems, enabling safe parallel execution.

In the context of AI coding agents, sandboxing refers to running agent code in an isolated execution environment that cannot affect live systems or other agent runs. A sandboxed agent has its own filesystem, process space, and network boundaries.
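One way to picture those three boundaries is as flags on a container launch. The sketch below is illustrative only: it assumes Docker as the container runtime, and the helper name `build_sandbox_command`, the image, the host path, and the limits are all hypothetical choices, not settings taken from Codegen, Devin, or any other tool. It only builds the command line; it does not start a container.

```python
import shlex


def build_sandbox_command(image, workdir, agent_cmd):
    """Build a `docker run` argv that isolates one agent run.

    Hypothetical helper: the image name, mount point, and limits are
    illustrative assumptions, not settings from any specific tool.
    """
    return [
        "docker", "run", "--rm",
        "--network", "none",                   # network boundary: no network access
        "--read-only",                         # filesystem: root filesystem is read-only
        "--tmpfs", "/tmp",                     # scratch space discarded with the container
        "--volume", f"{workdir}:/workspace",   # the only host directory the run can write to
        "--workdir", "/workspace",
        "--pids-limit", "256",                 # process space: cap the number of processes
        "--memory", "1g",                      # cap memory use
        "--cap-drop", "ALL",                   # drop Linux capabilities
        "--security-opt", "no-new-privileges",
        image,
        *agent_cmd,
    ]


command = build_sandbox_command(
    "python:3.12-slim", "/tmp/agent-run-1", ["python", "task.py"]
)
print(shlex.join(command))
```

Each agent run would get its own `workdir`, so nothing written by one run is visible to another.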

Sandboxing is essential for production use of coding agents. Without isolation, an agent that makes a mistake can corrupt the codebase, interfere with other running agents, or accidentally modify production data. Sandboxed environments also enable parallel agent execution, where multiple agents work on different tasks simultaneously without collision.
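The no-collision property can be shown in miniature with one scratch directory per task. This is a simplified sketch, not a real sandbox: separate directories keep parallel runs from overwriting each other's files, but they do not isolate processes or the network. The task names and the `run_task` helper are hypothetical.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_task(task_name):
    """Run one hypothetical agent task inside its own scratch directory."""
    with tempfile.TemporaryDirectory(prefix=f"agent-{task_name}-") as workspace:
        output = Path(workspace) / "result.txt"  # same filename in every run
        output.write_text(f"output of {task_name}")
        return task_name, output.read_text()


tasks = ["fix-bug", "add-tests", "update-docs"]
with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
    results = dict(pool.map(run_task, tasks))

for name, text in results.items():
    print(name, "->", text)
```

Every task writes a file called `result.txt`, yet no run sees or overwrites another's output, because each one has a private workspace that is deleted when the task finishes.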

Codegen and Devin both run agents in sandboxed environments. Most IDE-based tools like Cursor and Copilot do not sandbox their agent execution since they operate directly in the developer’s local environment.

In plain English

Running AI-generated code in an isolated environment so that if something breaks, it breaks there and not in your actual codebase or production systems.

Why it matters

AI agents make mistakes. A sandboxed environment means those mistakes are observable, correctable, and contained. Without sandboxing, an agent that writes a destructive query can damage real data, and one that writes an infinite loop can tie up shared resources. With sandboxing, the failure is caught inside the isolated environment and the agent can iterate until it gets it right.
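The infinite-loop case can be sketched with a child process and a wall-clock limit. This is a minimal illustration that assumes the generated code is Python; the `run_with_timeout` helper is hypothetical. A time limit on a child process contains a runaway loop only. It does not restrict filesystem or network access, which a full sandbox would also do.

```python
import subprocess
import sys


def run_with_timeout(source, seconds):
    """Run Python source in a child process with a wall-clock limit."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=seconds,  # the child is killed if it runs longer than this
        )
    except subprocess.TimeoutExpired:
        return "timed out"
    return "ok" if completed.returncode == 0 else "failed"


looping = run_with_timeout("while True: pass", seconds=1)
healthy = run_with_timeout("print('done')", seconds=10)
print(looping, healthy)
```

The runaway loop is stopped and reported as a result the agent can act on, while the parent process keeps running.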

In practice

An agent is assigned to migrate a database schema. It spins up in an isolated container with a snapshot of the test database, runs the migration, and finds that one of its ALTER TABLE statements locks a table in a way that causes timeouts. The agent revises its approach and reruns the migration three times before it completes cleanly. The production database was never at risk.
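The try, fail, revise loop in this scenario can be sketched against a throwaway copy of a database. The example below makes several assumptions: it uses SQLite in memory instead of a containerized test database, it substitutes a simpler failure (a rejected `ALTER TABLE`) for the lock timeout described above, and the `try_migrations` helper, table, and statements are all hypothetical.

```python
import sqlite3


def try_migrations(source, candidates):
    """Try each candidate statement on a fresh copy of `source`.

    Returns the first statement that applies cleanly, or None.
    """
    for statement in candidates:
        scratch = sqlite3.connect(":memory:")
        source.backup(scratch)  # copy the snapshot into the scratch database
        try:
            scratch.execute(statement)
            return statement
        except sqlite3.OperationalError:
            continue  # the failure stays inside the scratch copy
        finally:
            scratch.close()
    return None


# Stand-in for the snapshot the agent is given.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
source.execute("INSERT INTO users (name) VALUES ('ada')")
source.commit()

candidates = [
    # SQLite rejects a NOT NULL column that has no default value.
    "ALTER TABLE users ADD COLUMN email TEXT NOT NULL",
    # The revised attempt supplies a default and applies cleanly.
    "ALTER TABLE users ADD COLUMN email TEXT NOT NULL DEFAULT ''",
]
chosen = try_migrations(source, candidates)
columns = [row[1] for row in source.execute("PRAGMA table_info(users)")]
print(chosen)
print(columns)
```

Both attempts run only against scratch copies, so the original table still has its two original columns after the working statement is found.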

How Codegen uses Sandboxing

Every Codegen agent session runs in a process-isolated sandboxed environment. This is infrastructure, not a configurable option. It enables reproducible runs — the same task produces the same execution trace — and per-task cost tracking, because usage is metered per sandboxed session. It also underpins Codegen's SOC 2 compliance posture: auditors can verify that agent activity cannot affect production systems without a human merging a pull request.
