
OpenAI Overhauls Agents SDK With Native Sandboxes and Codex-Style File Tools

The update ships filesystem access, seven sandbox providers, and Codex-flavored primitives. Python first; TypeScript waits.

Oliver Senti, Senior AI Editor
April 19, 2026 · 4 min read
[Image: Abstract illustration of isolated compute containers connected to an AI orchestration layer, representing sandbox architecture]

OpenAI pushed a substantial rework of its Agents SDK this week, adding native sandbox execution and a model-native harness that lets agents read files, run shell commands, install dependencies, and edit code on their own. The update landed mid-April with an announcement post on the company's site and a beta release of Sandbox Agents. Python developers get it now. TypeScript developers wait.

A harness that looks suspiciously like Codex

The new harness bundles configurable memory, sandbox-aware orchestration, and filesystem tools the company openly describes as Codex-style. That's the tell. For months the gap between what Codex could do internally and what external developers could wire up with the SDK has been obvious to anyone who tried shipping a production agent. The SDK was orchestration glue. Everything interesting happened elsewhere.

Now the primitives come packaged. Tool use via MCP, progressive disclosure through Skills, custom instructions loaded from an AGENTS.md file, shell execution, apply_patch for file edits. If you've used Codex in the last six months, most of this will feel familiar. The point of the update is that you no longer have to rebuild it.
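The article names shell execution and apply_patch as primitives but doesn't document their interfaces, so here's a rough sketch of what that category of tool looks like. The function names mirror the article; the signatures and the search-and-replace patch format are assumptions, not the SDK's actual API:

```python
import pathlib
import subprocess

def shell(cmd: list[str]) -> str:
    """Run a command and capture stdout -- the kind of shell primitive
    an agent harness would expose inside a sandbox (sketch, not the SDK)."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

def apply_patch(path: str, old: str, new: str) -> None:
    """Minimal stand-in for a file-edit tool: exact-match search/replace.
    The real apply_patch format is not documented in the announcement."""
    p = pathlib.Path(path)
    text = p.read_text()
    if old not in text:
        raise ValueError(f"patch context not found in {path}")
    p.write_text(text.replace(old, new, 1))
```

The point isn't the implementation; it's that these two calls, plus dependency installation, are roughly the whole surface area an agent needs to edit and test code on its own.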

Code mode and subagents are listed as in development for both Python and TypeScript, which tells you where this is heading: OpenAI isn't done pulling agent-framework scaffolding in-house.

Seven providers, one abstraction

The sandbox story is more interesting than the harness story, honestly. Built-in support covers Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel. Bring your own if none of those fit. Holding it all together is a new Manifest abstraction that describes an agent's workspace: mounts, output directories, environment, Git repos, local files, and cloud storage hooks for AWS S3, Google Cloud Storage, Azure Blob Storage, and Cloudflare R2.

What this means in practice: the same workspace definition that runs against a local Unix sandbox on your laptop should, in theory, run unmodified against E2B or Daytona in production. The company's pitch is that they've taken the execution-layer plumbing every agent framework ends up writing and generalized it. Whether the abstraction survives contact with real workloads is a different question. Nobody can answer it yet.
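The announcement names the Manifest schema but publishes no field-level details. As an illustration of the idea only, with field names that are assumptions rather than the SDK's real schema, a provider-agnostic workspace definition might look like:

```python
from dataclasses import dataclass, field

@dataclass
class WorkspaceManifest:
    """Hypothetical sketch of a workspace manifest. Field names are
    illustrative assumptions, not OpenAI's actual Manifest schema."""
    mounts: dict[str, str] = field(default_factory=dict)  # host path -> sandbox path
    output_dir: str = "/workspace/out"
    env: dict[str, str] = field(default_factory=dict)
    git_repos: list[str] = field(default_factory=list)

    def to_provider_config(self, provider: str) -> dict:
        # The workspace definition stays fixed; only the provider
        # key changes between local dev and cloud execution.
        return {
            "provider": provider,
            "mounts": self.mounts,
            "output_dir": self.output_dir,
            "env": self.env,
            "repos": self.git_repos,
        }

manifest = WorkspaceManifest(
    mounts={"./src": "/workspace/src"},
    env={"PYTHONUNBUFFERED": "1"},
    git_repos=["https://github.com/example/app.git"],
)
local = manifest.to_provider_config("local-unix")
cloud = manifest.to_provider_config("e2b")
```

That's the whole pitch in miniature: `local` and `cloud` differ only in the provider key, so the workspace definition ports unmodified.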

Min Chen, Chief AI Officer at LexisNexis, shows up in the announcement with a testimonial about legal workflows.

"This allows teams to focus on developing high-value, long-running legal agents rather than building agent infrastructure from scratch," Chen said
which is the marketing-pager quote you'd expect from a launch partner. More instructive is what's missing: no independent benchmarks, no cross-provider latency numbers, no cost comparisons. The claim that the harness "improves reliability for long-running or coordinated tasks" is unsourced.

The security pitch (this part I buy)

The harness and the compute layer run separately. Credentials your agent uses for orchestration don't sit inside the environment where model-generated code runs. That's a real win against prompt injection, and it's the kind of architectural decision that's hard to retrofit later.
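The separation boils down to an allowlist at the boundary. A toy sketch of the idea, with variable names that are purely illustrative, not anything from the SDK:

```python
# The orchestration layer holds API keys; the sandbox env is built from
# an explicit allowlist, so model-generated code never sees credentials.
# All names here are illustrative, not the SDK's.
RUNNER_ENV = {
    "OPENAI_API_KEY": "sk-example",  # placeholder, stays on the runner
    "TASK_ID": "42",
    "LOG_LEVEL": "info",
}
SANDBOX_ALLOWLIST = {"TASK_ID", "LOG_LEVEL"}

def sandbox_env(runner_env: dict, allowlist: set) -> dict:
    # Only explicitly allowed variables cross into the sandbox.
    return {k: v for k, v in runner_env.items() if k in allowlist}
```

A prompt-injected agent can exfiltrate whatever is in its environment; keeping the keys out of that environment in the first place is why this is hard to retrofit and worth doing up front.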

Agent state lives outside the sandbox too. If a container dies, the runner spins up a new one and resumes from the last checkpoint using built-in snapshotting. Subagents can execute in parallel across isolated containers. Fan-out is straightforward, at least on paper.
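The resume behavior is easiest to picture as a loop: completed steps are snapshotted to storage outside the sandbox, and a replacement container skips anything already done. A toy model of that mechanic, with everything except the concept being an assumed shape:

```python
import json

def run_with_checkpoints(steps, snapshot_store, crash_at=None):
    """Toy checkpoint/resume loop: state lives outside the sandbox,
    so a container crash only costs work since the last snapshot."""
    state = snapshot_store.get("run_state", {"done": []})
    for i, step in enumerate(steps):
        if step in state["done"]:
            continue  # completed before the crash; skip on resume
        if crash_at is not None and i == crash_at:
            raise RuntimeError("container died")
        state["done"].append(step)
        # Snapshot a serialized copy after every step.
        snapshot_store["run_state"] = json.loads(json.dumps(state))

store = {}
try:
    run_with_checkpoints(["fetch", "build", "test"], store, crash_at=2)
except RuntimeError:
    pass  # runner spins up a fresh container...
run_with_checkpoints(["fetch", "build", "test"], store)  # ...and resumes
```

On the second pass only "test" actually runs; "fetch" and "build" are recovered from the snapshot. Fan-out across subagents is the same trick with one snapshot per container.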

What's missing

Python-only at launch is the thing to note. The JavaScript/TypeScript version of the SDK has been lagging for a while now, and this update widens the gap. OpenAI says TypeScript support for the new harness and sandbox features is coming "in a future release," which in practice usually means months.

Pricing detail is thin, too. The company says standard API pricing applies, based on tokens and tool usage. Sandbox compute isn't free and neither is snapshot storage. Billing for that runs through whichever provider you pick, which makes cost estimation on multi-day workflows a moving target. Expect surprise bills.

Karan Sharma of OpenAI's product team framed the update as "taking our existing agents SDK and making it so it's compatible with all of these sandbox providers." Fair description, if understated. The SDK is also quietly becoming a lot more opinionated about how agents should be built, and what primitives you're expected to use.

The full GitHub repo and the beta release notes document the new SandboxAgent class, the Manifest schema, and the RunState serialization details. Worth a read before you port anything serious.

Tags: OpenAI · Agents SDK · AI agents · sandbox execution · developer tools · MCP · Codex · Python SDK
Oliver Senti, Senior AI Editor

Former software engineer turned tech writer, Oliver has spent the last five years tracking the AI landscape. He brings a practitioner's eye to the hype cycles and genuine innovations defining the field, helping readers separate signal from noise.

