
OpenAI Overhauls Agents SDK With Native Sandboxes and Codex-Style File Tools

The update ships filesystem access, seven sandbox providers, and Codex-flavored primitives. Python first; TypeScript waits.

Oliver Senti, Senior AI Editor
April 19, 2026 · 4 min read
[Image: Abstract illustration of isolated compute containers connected to an AI orchestration layer, representing sandbox architecture]

OpenAI pushed a substantial rework of its Agents SDK this week, adding native sandbox execution and a model-native harness that lets agents read files, run shell commands, install dependencies, and edit code on their own. The update landed mid-April with an announcement post on the company's site and a beta release of Sandbox Agents. Python developers get it now. TypeScript developers wait.

A harness that looks suspiciously like Codex

The new harness bundles configurable memory, sandbox-aware orchestration, and filesystem tools the company openly describes as Codex-style. That's the tell. For months the gap between what Codex could do internally and what external developers could wire up with the SDK has been obvious to anyone who tried shipping a production agent. The SDK was orchestration glue. Everything interesting happened elsewhere.

Now the primitives come packaged. Tool use via MCP, progressive disclosure through Skills, custom instructions loaded from an AGENTS.md file, shell execution, apply_patch for file edits. If you've used Codex in the last six months, most of this will feel familiar. The point of the update is that you no longer have to rebuild it.
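The article names shell execution and apply_patch as primitives but doesn't document their interfaces, so here's a rough sketch of what that category of tool looks like. The function names mirror the article; the signatures and the search-and-replace patch format are assumptions, not the SDK's actual API:

```python
import pathlib
import subprocess

def shell(cmd: list[str]) -> str:
    """Run a command and capture stdout -- the kind of shell primitive
    an agent harness would expose inside a sandbox (sketch, not the SDK)."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

def apply_patch(path: str, old: str, new: str) -> None:
    """Minimal stand-in for a file-edit tool: exact-match search/replace.
    The real apply_patch format is not documented in the announcement."""
    p = pathlib.Path(path)
    text = p.read_text()
    if old not in text:
        raise ValueError(f"patch context not found in {path}")
    p.write_text(text.replace(old, new, 1))
```

The point isn't the implementation; it's that these two calls, plus dependency installation, are roughly the whole surface area an agent needs to edit and test code on its own.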

Code mode and subagents are listed as in development for both Python and TypeScript, which tells you where this is heading: OpenAI isn't done pulling agent-framework scaffolding in-house.

Seven providers, one abstraction

The sandbox story is more interesting than the harness story, honestly. Built-in support covers Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel. Bring your own if none of those fit. Holding it all together is a new Manifest abstraction that describes an agent's workspace: mounts, output directories, environment, Git repos, local files, and cloud storage hooks for AWS S3, Google Cloud Storage, Azure Blob Storage, and Cloudflare R2.

What this means in practice: the same workspace definition that runs against a local Unix sandbox on your laptop should, in theory, run unmodified against E2B or Daytona in production. The company's pitch is that they've taken the execution-layer plumbing every agent framework ends up writing and generalized it. Whether the abstraction survives contact with real workloads is a different question. Nobody can answer it yet.
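The announcement names the Manifest schema but publishes no field-level details. As an illustration of the idea only, with field names that are assumptions rather than the SDK's real schema, a provider-agnostic workspace definition might look like:

```python
from dataclasses import dataclass, field

@dataclass
class WorkspaceManifest:
    """Hypothetical sketch of a workspace manifest. Field names are
    illustrative assumptions, not OpenAI's actual Manifest schema."""
    mounts: dict[str, str] = field(default_factory=dict)  # host path -> sandbox path
    output_dir: str = "/workspace/out"
    env: dict[str, str] = field(default_factory=dict)
    git_repos: list[str] = field(default_factory=list)

    def to_provider_config(self, provider: str) -> dict:
        # The workspace definition stays fixed; only the provider
        # key changes between local dev and cloud execution.
        return {
            "provider": provider,
            "mounts": self.mounts,
            "output_dir": self.output_dir,
            "env": self.env,
            "repos": self.git_repos,
        }

manifest = WorkspaceManifest(
    mounts={"./src": "/workspace/src"},
    env={"PYTHONUNBUFFERED": "1"},
    git_repos=["https://github.com/example/app.git"],
)
local = manifest.to_provider_config("local-unix")
cloud = manifest.to_provider_config("e2b")
```

That's the whole pitch in miniature: `local` and `cloud` differ only in the provider key, so the workspace definition ports unmodified.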

Min Chen, Chief AI Officer at LexisNexis, shows up in the announcement with a testimonial about legal workflows.

"This allows teams to focus on developing high-value, long-running legal agents rather than building agent infrastructure from scratch," Chen said
which is the marketing-pager quote you'd expect from a launch partner. More instructive is what's missing: no independent benchmarks, no cross-provider latency numbers, no cost comparisons. The claim that the harness "improves reliability for long-running or coordinated tasks" is unsourced.

The security pitch (this part I buy)

The harness and the compute layer run separately. Credentials your agent uses for orchestration don't sit inside the environment where model-generated code runs. That's a real win against prompt injection, and it's the kind of architectural decision that's hard to retrofit later.
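The separation boils down to an allowlist at the boundary. A toy sketch of the idea, with variable names that are purely illustrative, not anything from the SDK:

```python
# The orchestration layer holds API keys; the sandbox env is built from
# an explicit allowlist, so model-generated code never sees credentials.
# All names here are illustrative, not the SDK's.
RUNNER_ENV = {
    "OPENAI_API_KEY": "sk-example",  # placeholder, stays on the runner
    "TASK_ID": "42",
    "LOG_LEVEL": "info",
}
SANDBOX_ALLOWLIST = {"TASK_ID", "LOG_LEVEL"}

def sandbox_env(runner_env: dict, allowlist: set) -> dict:
    # Only explicitly allowed variables cross into the sandbox.
    return {k: v for k, v in runner_env.items() if k in allowlist}
```

A prompt-injected agent can exfiltrate whatever is in its environment; keeping the keys out of that environment in the first place is why this is hard to retrofit and worth doing up front.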

Agent state lives outside the sandbox too. If a container dies, the runner spins up a new one and resumes from the last checkpoint using built-in snapshotting. Subagents can execute in parallel across isolated containers. Fan-out is straightforward, at least on paper.
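The resume behavior is easiest to picture as a loop: completed steps are snapshotted to storage outside the sandbox, and a replacement container skips anything already done. A toy model of that mechanic, with everything except the concept being an assumed shape:

```python
import json

def run_with_checkpoints(steps, snapshot_store, crash_at=None):
    """Toy checkpoint/resume loop: state lives outside the sandbox,
    so a container crash only costs work since the last snapshot."""
    state = snapshot_store.get("run_state", {"done": []})
    for i, step in enumerate(steps):
        if step in state["done"]:
            continue  # completed before the crash; skip on resume
        if crash_at is not None and i == crash_at:
            raise RuntimeError("container died")
        state["done"].append(step)
        # Snapshot a serialized copy after every step.
        snapshot_store["run_state"] = json.loads(json.dumps(state))

store = {}
try:
    run_with_checkpoints(["fetch", "build", "test"], store, crash_at=2)
except RuntimeError:
    pass  # runner spins up a fresh container...
run_with_checkpoints(["fetch", "build", "test"], store)  # ...and resumes
```

On the second pass only "test" actually runs; "fetch" and "build" are recovered from the snapshot. Fan-out across subagents is the same trick with one snapshot per container.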

What's missing

Python-only at launch is the thing to note. The JavaScript/TypeScript version of the SDK has been lagging for a while now, and this update widens the gap. OpenAI says TypeScript support for the new harness and sandbox features is coming "in a future release," which in practice usually means months.

Pricing detail is thin, too. The company says standard API pricing applies, based on tokens and tool usage. Sandbox compute isn't free and neither is snapshot storage. Billing for that runs through whichever provider you pick, which makes cost estimation on multi-day workflows a moving target. Expect surprise bills.

Karan Sharma of OpenAI's product team framed the update as "taking our existing agents SDK and making it so it's compatible with all of these sandbox providers." Fair description, if understated. The SDK is also quietly becoming a lot more opinionated about how agents should be built, and what primitives you're expected to use.

The full GitHub repo and the beta release notes document the new SandboxAgent class, the Manifest schema, and the RunState serialization details. Worth a read before you port anything serious.

Tags: OpenAI · Agents SDK · AI agents · sandbox execution · developer tools · MCP · Codex · Python SDK
Oliver Senti, Senior AI Editor

Former software engineer turned tech writer, Oliver has spent the last five years tracking the AI landscape. He brings a practitioner's eye to the hype cycles and genuine innovations defining the field, helping readers separate signal from noise.

