Codex Beginner Guide: Sandboxed Cloud Coding Without the Pitfalls (2026)

How OpenAI Codex runs sandboxed cloud and CLI coding tasks in 2026: which ChatGPT plan you need, the GPT-5.5 default, the task-spec format that produces reviewable PRs, and the mistakes that bite first-time users.

Published: May 17, 2026 | Updated: Jun 04, 2026 | Author: AI Productivity Guide Team

TL;DR

Codex is OpenAI’s coding agent. It has two surfaces: Codex CLI, which runs in your terminal, and Codex cloud (web), which spins up an isolated sandbox, branches your repo, makes the changes, runs your tests, and opens a pull request while you do something else. As of June 2026 both run on GPT-5.5 by default. You need a paid ChatGPT plan: Codex is not included on Free, Go ($8/month) covers lightweight tasks only, and Plus at $20/month is the practical entry point. The single biggest beginner mistake is treating Codex like a chat box instead of writing a tight task spec with explicit constraints and acceptance criteria.

What Codex is, and what it isn’t

Codex is the “spin off a task, walk away, come back to a PR” model of agentic coding. The cloud surface runs each task in an OpenAI-managed container with its own copy of your repo, which is what makes parallel work possible — but it also creates a class of bugs first-timers don’t expect: environment drift, missing secrets, and the agent helpfully changing things you never asked it to touch.

Two facts about the sandbox shape everything else:

- Internet access is split into two phases. During the setup phase (installing dependencies), the container can reach the network. During the agent phase, internet is off by default; you enable limited or unrestricted access per environment if a task genuinely needs it. (OpenAI: agent internet access)
- Secrets only exist during setup. Any secret you configure for a cloud environment is available while the setup script runs, then removed before the agent phase starts. If your tests need a live API key at runtime, the sandbox won’t have it. (OpenAI: cloud environments)
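
One way to design around the stripped secrets is to have your test setup decide up front whether it can use a live client or must fall back to a stub. The following is a minimal TypeScript sketch, not anything Codex provides: `missingSecrets`, `clientMode`, and the secret name `PAYMENTS_API_KEY` are hypothetical names chosen for illustration.

```typescript
// Environment shape: variable names mapped to values that may be unset.
type Env = Record<string, string | undefined>;

// Names of required secrets that are unset or empty in the given environment.
function missingSecrets(env: Env, required: string[]): string[] {
  return required.filter((name) => !env[name]);
}

// Use the live client only when every required secret is present.
// In the agent phase the secrets are gone, so this resolves to "stub".
function clientMode(env: Env, required: string[]): "live" | "stub" {
  return missingSecrets(env, required).length === 0 ? "live" : "stub";
}

// Hypothetical secret name; replace with whatever your project uses.
const modeInAgentPhase = clientMode({}, ["PAYMENTS_API_KEY"]);
```

In a Node.js test setup you could pass `process.env` as the first argument and swap in a stubbed client whenever the mode is `"stub"`, so the suite behaves the same way in the sandbox as it does on your machine without credentials.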

Reach for Codex when the work is a self-contained, long-running edit you don’t need to watch land in real time: a 30-file refactor, a dependency upgrade, a test-coverage sweep, a documentation pass. A full-repo task like auditing a React Native project with AI is a natural fit.

Skip Codex when you need to see partial output and steer mid-flight, when the feedback loop is tight, when the task needs a runtime secret the sandbox strips, or for one-line tweaks where the setup cost outweighs the savings. For those, an interactive tool like Cursor or Claude Code is the better call.

Which plan and model you actually get (June 2026)

Codex usage moved to token- and credit-based pricing on April 2, 2026 for Plus, Pro, Business, and new Enterprise plans, replacing the old per-message limits. Each plan now includes an allowance per 5-hour window, which OpenAI still quotes as an approximate number of local messages, plus the option to buy extra credits at published per-million-token rates. (Codex pricing)

Plan	Price (June 2026)	Codex included?	GPT-5.5 allowance (per 5h window)
Free	$0	No	—
Go	$8/mo	Lightweight tasks only	—
Plus	$20/mo	Yes	~15–80 local messages
Pro (5x)	$100/mo	Yes	~80–400 local messages
Pro (20x)	$200/mo	Yes	~300–1,600 local messages
Business / Enterprise	Per seat	Yes	Enterprise uses flexible credits, no fixed cap

Ranges are bands published by OpenAI, not fixed numbers — actual usage depends on task size and reasoning effort. In practice, heavy Codex users report ~$100–$200 of usage per developer per month once they’re running multiple instances and fast mode.
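
The arithmetic behind "plan fee plus extra credits" is simple enough to sketch. This TypeScript example is only an illustration: the function names are hypothetical, and the rate of 10 USD per million tokens is a placeholder picked to make the numbers easy to follow, not a published price. Take the real rate from the Codex pricing page.

```typescript
// Cost of extra usage bought at a per-million-token rate.
// The rate is a required input: this sketch deliberately hard-codes no price.
function overageCostUsd(extraTokens: number, usdPerMillionTokens: number): number {
  if (extraTokens < 0 || usdPerMillionTokens < 0) {
    throw new RangeError("tokens and rate must be non-negative");
  }
  return (extraTokens / 1_000_000) * usdPerMillionTokens;
}

// Plan fee plus any credits bought on top of the included allowance.
function monthlySpendUsd(planUsd: number, extraTokens: number, usdPerMillionTokens: number): number {
  return planUsd + overageCostUsd(extraTokens, usdPerMillionTokens);
}

// Plus plan ($20) with 8 million extra tokens at the PLACEHOLDER rate of 10 USD per million.
const plusWithOverage = monthlySpendUsd(20, 8_000_000, 10);
```

Under that placeholder rate the example comes to $100 for the month, which shows how quickly extra credits can outweigh the plan fee itself.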

On models: GPT-5.5 became the Codex default around April 23–24, 2026 and is the recommended pick for most coding, refactors, debugging, and test work. The picker also exposes gpt-5.4, the faster gpt-5.4-mini for subagents, and gpt-5.3-codex. (Codex models) Note that Codex runs OpenAI models only — if you want Anthropic models in a terminal agent, that’s Claude Code, not Codex.

Install the CLI (5 minutes)

If you want to drive Codex locally instead of (or alongside) the cloud, install the CLI. It needs Node.js 22 or later.

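```shell
# Install the CLI globally from the scoped package
npm install -g @openai/codex
# Confirm the install worked
codex --version
# Launch the CLI
codex
```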

On first launch, choose Sign in with ChatGPT — this links your existing Plus/Pro/Business subscription via OAuth, so you don’t pay twice. (You can alternatively sign in with an OpenAI API key and pay per token.) Be careful with the package name: it’s @openai/codex, not the unrelated unscoped codex package from 2012. (npm: @openai/codex)
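
If you script your team's onboarding, a small guard for the Node.js requirement can save a confusing install failure. This is a sketch under the article's stated minimum of Node.js 22; `meetsNodeMinimum` is a hypothetical helper, not part of the Codex CLI.

```typescript
// Parse a version string such as "22.3.0" or "v20.11.1" and compare its major number.
function meetsNodeMinimum(version: string, minMajor: number = 22): boolean {
  const match = /^v?(\d+)\./.exec(version.trim());
  if (!match) {
    throw new Error(`unrecognized Node.js version: ${version}`);
  }
  return Number(match[1]) >= minMajor;
}
```

In a Node.js script you would call it as `meetsNodeMinimum(process.versions.node)` and print an upgrade hint when it returns `false`.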

Before you start a cloud task

- Confirm your repo’s CI is green on main. If you start from a failing baseline, Codex inherits it and you can’t tell its bugs from yours.
- Add an AGENTS.md at the repo root. Codex reads it to find your build/lint/test commands and your “never do this” rules, which makes this the single highest-leverage setup step. (OpenAI: AGENTS.md)
- Configure any setup-phase secrets your dependency install needs. Remember they vanish before the agent runs, so design tests that don’t need live credentials at runtime.
- Decide branch naming and PR conventions up front so the agent’s output drops straight into your workflow.
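
To make the AGENTS.md item concrete, here is a small TypeScript sketch that renders one from a typed object. Everything in it is an assumption for illustration: the `AgentGuide` shape, the section headings, and the example commands (`npm run lint` presumes your package.json defines a `lint` script). The file is ordinary Markdown, so treat this layout as one reasonable starting point rather than a required format.

```typescript
// Hypothetical shape for the facts an AGENTS.md should state.
interface AgentGuide {
  setup: string[];  // commands that install dependencies
  checks: string[]; // build, lint, and test commands
  never: string[];  // rules the agent must not break
}

// Render the guide as plain Markdown. The headings are a suggestion only.
function renderAgentsMd(guide: AgentGuide): string {
  const commands = (items: string[]) => items.map((c) => `- \`${c}\``).join("\n");
  const rules = (items: string[]) => items.map((r) => `- ${r}`).join("\n");
  return [
    "# AGENTS.md",
    "## Setup",
    commands(guide.setup),
    "## Checks to run before opening a PR",
    commands(guide.checks),
    "## Never do this",
    rules(guide.never),
  ].join("\n\n") + "\n";
}

// Example values; adjust the commands and rules to your own repo.
const exampleAgentsMd = renderAgentsMd({
  setup: ["npm ci"],
  checks: ["npm run lint", "npm test"],
  never: ["Do not edit CI configuration.", "Do not touch secret handling."],
});
```

Writing the rendered string to `AGENTS.md` at the repo root gives the agent the same build, lint, and test commands a new teammate would need.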

Step by step

1. Sign in and connect your repo. If you don’t know the codebase well yet, run a quick AI codebase tour first so your task spec uses real file and function names instead of guesses.
2. Create a task with explicit acceptance criteria. A good starter task is using Codex to review your sitemap. Format the spec like a bug ticket plus a test:

Goal: Replace deprecated `request.json()` calls with `await request.json()` in `src/api/`.

Constraints:
- Do not modify any file outside `src/api/`.
- Preserve existing error-handling patterns.
- Use the project's existing async style (look at `src/api/auth.ts` for reference).

Acceptance:
- All existing tests pass.
- New test in `src/api/__tests__/json-parse.test.ts` covers the new behavior.
- No console warnings during `npm test`.

3. Let Codex work. It spins up a sandbox, branches, edits, runs your tests, and opens a PR. Small scoped tasks typically finish in 5–15 minutes, and full-repo sweeps can take 30–60 minutes. For comparison, OpenAI describes a single cloud task as running roughly 1–30 minutes of active agent time.
4. Review the PR like any human PR. Read every changed file, check the test that proves the change, and run the branch locally before you trust it.
5. Iterate via PR comments. Codex reads comments and pushes follow-up commits. Use this for small corrections, not for rewriting the goal; if the goal changed, open a fresh task.
6. Merge when satisfied. If you find a missed case after merge, file a new task rather than pushing a follow-up commit onto a merged agent branch.
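
If your team writes many specs, it can help to lint them before submitting. The sketch below mirrors the Goal / Constraints / Acceptance layout from step 2. `TaskSpec`, `lintTaskSpec`, and `renderTaskSpec` are hypothetical names, and the five-word threshold for a "specific enough" goal is an arbitrary heuristic rather than a rule from OpenAI.

```typescript
// Hypothetical structure mirroring the Goal / Constraints / Acceptance layout.
interface TaskSpec {
  goal: string;
  constraints: string[];
  acceptance: string[];
}

// Return human-readable problems; an empty array means the spec looks complete.
function lintTaskSpec(spec: TaskSpec): string[] {
  const problems: string[] = [];
  // Arbitrary heuristic: very short goals tend to be vague.
  if (spec.goal.trim().split(/\s+/).length < 5) {
    problems.push("goal is too short to be specific");
  }
  if (spec.constraints.length === 0) {
    problems.push("no constraints: the agent may edit anything");
  }
  if (spec.acceptance.length === 0) {
    problems.push("no acceptance criteria: 'done' is undefined");
  }
  return problems;
}

// Render the spec in the same plain-text layout used for the task description.
function renderTaskSpec(spec: TaskSpec): string {
  const bullets = (items: string[]) => items.map((i) => `- ${i}`).join("\n");
  return [
    `Goal: ${spec.goal}`,
    `Constraints:\n${bullets(spec.constraints)}`,
    `Acceptance:\n${bullets(spec.acceptance)}`,
  ].join("\n\n");
}
```

A spec like "fix the login bug" with no constraints or acceptance criteria trips all three checks, which is the point: the linter only enforces that the three sections exist, and the quality of what you put in them is still up to you.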

Verify the output before you merge

The review pass is where Codex earns or loses your trust. Four checks catch most problems:

- Did it respect the scope constraints? The diff’s file list answers “did it modify anything outside src/api/” in one glance.
- Did every test pass, including ones you didn’t name? Scan the CI log for skipped tests; a green checkmark with three skips is not a pass.
- Did it add “helpful” extras? Bonus refactors you didn’t request are the most common Codex failure mode. Revert them; they aren’t free.
- Is the PR description accurate? Codex occasionally overstates what it did. Read the code, not the summary.
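
The first two checks are mechanical enough to script. Here is a minimal TypeScript sketch; both function names are hypothetical, and the `skipped` pattern is a guess at how your test runner words its summary, so adjust it to match your CI output.

```typescript
// Files in the diff that fall outside the allowed directory.
function filesOutsideScope(changedFiles: string[], allowedDir: string): string[] {
  // Normalize to exactly one trailing slash so "src/api" does not match "src/apiary/".
  const prefix = allowedDir.replace(/\/+$/, "") + "/";
  return changedFiles.filter((file) => file.indexOf(prefix) !== 0);
}

// Log lines that mention skipped tests, for a human to inspect.
// Assumption: the runner uses the word "skipped" in its output.
function linesMentioningSkips(ciLog: string): string[] {
  return ciLog.split("\n").filter((line) => /\bskipped\b/i.test(line));
}
```

You can get the changed-file list from `git diff --name-only main...HEAD` on the agent's branch and pass it in one path per array element. An empty result from `filesOutsideScope` means the scope constraint held, while any line returned by `linesMentioningSkips` deserves a closer look before you trust the green checkmark.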

Common mistakes

- Vague task descriptions like “fix the login bug.” The agent will pick a fix; it may not be yours.
- No acceptance criteria. Without a concrete “done” test, the agent’s definition of done won’t match yours.
- Skipping the codebase tour on an unfamiliar repo. The spec needs real names; without them the agent guesses, and the guesses are wrong.
- Assuming the sandbox has your secrets at runtime. It doesn’t; they’re stripped after setup. Design around it.
- Letting Codex touch CI config or secret handling. Scope those out explicitly in constraints.
- Running parallel tasks on related files. Two agents editing adjacent code conflict in ways that are no fun to untangle.
- Treating the PR as production-ready without review. Codex output is a PR-quality first draft, not a merge-without-reading.
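
For the parallel-task mistake, a rough pre-flight check is to compare the directory scopes you plan to hand to each task. This TypeScript sketch uses hypothetical helper names and only catches scopes where one directory contains the other. It cannot detect logically related code that lives in separate directories.

```typescript
// True when one directory scope contains the other, e.g. "src/api" and "src/api/auth".
function scopesOverlap(a: string, b: string): boolean {
  const norm = (p: string) => p.replace(/\/+$/, "") + "/";
  const x = norm(a);
  const y = norm(b);
  return x.indexOf(y) === 0 || y.indexOf(x) === 0;
}

// List every pair of task scopes that should not run in parallel.
function conflictingPairs(scopes: string[]): Array<[string, string]> {
  const pairs: Array<[string, string]> = [];
  scopes.forEach((a, i) => {
    scopes.slice(i + 1).forEach((b) => {
      if (scopesOverlap(a, b)) pairs.push([a, b]);
    });
  });
  return pairs;
}
```

If `conflictingPairs` returns anything, run those tasks one after the other or merge them into a single spec.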

FAQ

Do I need a paid plan to use Codex?

Yes. As of June 2026 Codex is not on the Free plan. Go ($8/month) covers lightweight tasks only, Plus ($20/month) is the practical entry point, and Pro ($100 or $200/month) raises the usage ceiling substantially.

What model does Codex run?

GPT-5.5 by default since late April 2026, with GPT-5.4, GPT-5.4-mini, and GPT-5.3-codex also selectable. Codex runs OpenAI models only.

Does the cloud sandbox have internet access?

By default, only during the setup phase. During the agent phase internet is off unless you enable limited or full access for that environment, which is worth doing only if a task truly needs it.

Can it run my full test suite?

Yes, inside the sandbox, as long as dependencies install cleanly via the commands in your AGENTS.md and your tests don’t depend on a live secret (secrets are removed before the agent phase).

What happens if a task fails partway?

Codex usually opens a PR with whatever it completed plus a note about the failure. Read the failure log, tighten the spec, and re-run.

How is this different from Claude Code or Cursor?

Codex cloud is async; Cursor and Claude Code are interactive. Pick async when you want to fire off parallel work and walk away, and interactive when you want a tight feedback loop.

Tags: #AI coding #Tutorial #Codex
