
Claude Code vs OpenAI Codex vs Cursor: A Security Team’s Guide to Their Permission Models (2026)

Anomity Research Security Researcher, Anomity · Jun 12, 2026 · 8 min read

Claude Code, OpenAI Codex, and Cursor all ship a coding agent that can edit files, run shell commands, and reach the network, and all three put a permission layer in front of those actions. The layers are not the same shape. Claude Code evaluates ordered allow/ask/deny rules; Codex pairs an OS-level sandbox with an approval policy; Cursor runs an allowlist plus an LLM classifier and a sandbox. If you own the security of engineers using more than one, the differences decide what an admin can actually enforce versus what is left to each developer's local config. This is a comparison of those three models as documented, and a companion to our pillar on securing AI coding agents and CLIs.

Every claim here is grounded in the official docs: Claude Code's permissions page, OpenAI's Codex agent approvals and security docs, and Cursor's agent security and terminal docs. We have walked through each model on its own - how Claude Code permissions work, the OpenAI Codex sandbox and approval model, and the limits of Cursor Auto-Run and YOLO allow/deny. This post sets them side by side so you can reason about a mixed fleet, and shows where the fleet inventory you need sits above all three.

Default posture: what runs before anyone says yes

Start with the out-of-the-box behavior, because that is what most engineers actually run. Claude Code uses a tiered default: read-only tools like file reads and Grep run without a prompt, while Bash commands and file modifications require approval on first use. A built-in set of read-only Bash commands - ls, cat, grep, find, read-only git - also runs without prompting in every mode, and the set is not configurable.

Codex's default for a version-controlled folder is the Auto mode, which maps to workspace-write: Codex can read, edit, and run commands inside the workspace automatically, and asks for approval only to edit files outside the workspace or run a command that needs network access. Network is disabled by default, and protected paths such as .git, .agents, and .codex stay read-only regardless of mode. Cursor's default is Auto-review: reading and code search never prompt, file modifications skip approval except for configuration files, and terminal commands require approval unless you add them to your allowlist. The honest one-line summary: all three read freely, all three protect their own config directory, and the real divergence is in how a command earns the right to run.

How you allow and deny: rules, policy, allowlist

This is where the three models stop rhyming. Claude Code is rule-based and the order is the whole point: rules are evaluated deny, then ask, then allow, first match wins regardless of specificity - a matching ask rule prompts even when a more specific allow also matches. Deny is absolute: a tool denied at any settings level cannot be re-allowed by another. Rules use Tool or Tool(specifier) syntax with glob wildcards, and the docs are blunt that argument-constraining Bash patterns are fragile - Bash(curl example.com *) does not survive a flag before the URL, a protocol switch, a redirect, or a variable holding the URL. We dug into that brittleness in how Claude Code permissions work.
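To make the ordering concrete, here is a minimal sketch of deny-then-ask-then-allow evaluation. It is a toy model, not Claude Code's matcher: the rule strings are hypothetical, Python's fnmatch stands in for the real glob handling, and the "prompt" fallback is a simplification of whatever the active permission mode would do when no rule matches.

```python
from fnmatch import fnmatchcase

# Hypothetical rule sets for illustration only, not a recommended policy.
RULES = {
    "deny": ["Bash(rm -rf *)"],
    "ask": ["Bash(git push *)"],
    "allow": ["Bash(git *)", "Bash(git push origin main)"],
}


def evaluate(call: str, rules: dict[str, list[str]]) -> str:
    """Return 'deny', 'ask', 'allow', or 'prompt' for a call like 'Bash(git status)'."""
    # Tiers are checked in a fixed order, so a more specific rule in a
    # later tier never overrides a match in an earlier one.
    for tier in ("deny", "ask", "allow"):
        if any(fnmatchcase(call, pattern) for pattern in rules.get(tier, [])):
            return tier
    # Assumption: nothing matched, so fall back to asking the user.
    return "prompt"


print(evaluate("Bash(git push origin main)", RULES))
print(evaluate("Bash(git status)", RULES))
```

The first call lands on ask even though an exact allow rule exists for the same command, which is the "regardless of specificity" behavior described above. The same toy matcher also shows the fragility of argument patterns: a rule such as Bash(curl example.com *) does not match a call that puts a flag before the URL.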

Codex separates the two questions Claude Code merges. Sandbox mode (read-only, workspace-write, or full access) sets technical capability; the approval policy (untrusted, on-request, or never) sets when consent is required. untrusted runs only known-safe read operations and asks for state-mutating commands; on-request prompts on escalation; never issues no prompts but still respects sandbox constraints. Full access is the --dangerously-bypass-approvals-and-sandbox flag, marked elevated risk because it disables both layers at once - the trust boundary we examined in the Codex full-access trust boundary.
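One way to picture the two axes is as two independent inputs to a single decision. The sketch below is our own simplified model of the behavior described above, not Codex's implementation: the action categories ("read", "write_workspace", "write_outside", "network") and the "run" / "prompt" / "blocked" outcomes are hypothetical labels chosen for illustration.

```python
from enum import Enum


class Sandbox(Enum):
    READ_ONLY = "read-only"
    WORKSPACE_WRITE = "workspace-write"
    FULL_ACCESS = "full access"


class Approval(Enum):
    UNTRUSTED = "untrusted"
    ON_REQUEST = "on-request"
    NEVER = "never"


def sandbox_permits(sandbox: Sandbox, action: str) -> bool:
    """Capability axis: what the sandbox technically allows."""
    if sandbox is Sandbox.FULL_ACCESS or action == "read":
        return True
    return sandbox is Sandbox.WORKSPACE_WRITE and action == "write_workspace"


def decide(sandbox: Sandbox, approval: Approval, action: str) -> str:
    """Consent axis layered on top of the capability axis."""
    # untrusted: only reads run unprompted; anything state-mutating asks.
    if approval is Approval.UNTRUSTED and action != "read":
        return "prompt"
    if sandbox_permits(sandbox, action):
        return "run"
    # never: no prompts are issued, but the sandbox still applies.
    if approval is Approval.NEVER:
        return "blocked"
    # on-request: leaving the sandbox is an escalation that needs consent.
    return "prompt"


print(decide(Sandbox.WORKSPACE_WRITE, Approval.ON_REQUEST, "network"))
print(decide(Sandbox.WORKSPACE_WRITE, Approval.NEVER, "network"))
```

Under these assumptions the same network request prompts under on-request and is refused outright under never, which is why the approval policy alone does not tell you what an install can do.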

Cursor leans on an allowlist and, increasingly, a classifier. Commands on the allowlist skip sandbox restrictions and run immediately; anything else, under Auto-review, runs in the sandbox when possible or goes to an LLM classifier that decides allow or block. Cursor's docs state the team is deprecating the older denylist in favor of the allowlist, and that all run modes are best-effort with bypasses possible. MCP tools follow their own path: a third-party server needs initial approval, then per-tool-call approval, unless you pre-approve specific tools with an MCP allowlist. That trust step is exactly what failed in the Cursor MCPoison MCP trust bypass, CVE-2025-54136.
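The Auto-review path for terminal commands can be sketched as a three-step router. This is a rough model built only from the description above: matching the allowlist on the first token of the command is our assumption (Cursor's actual matching may differ), and can_sandbox and classifier are hypothetical stand-ins for the sandbox check and the LLM classifier.

```python
from typing import Callable


def route_command(
    command: str,
    allowlist: set[str],
    can_sandbox: Callable[[str], bool],
    classifier: Callable[[str], bool],
) -> str:
    """Return how a terminal command would be handled in this toy model."""
    tokens = command.split()
    program = tokens[0] if tokens else ""
    # Allowlisted commands run immediately and skip sandbox restrictions.
    if program in allowlist:
        return "run-unsandboxed"
    # Everything else runs inside the sandbox when that is possible.
    if can_sandbox(command):
        return "run-sandboxed"
    # Otherwise the classifier decides between allow and block.
    return "run-after-review" if classifier(command) else "block"


print(route_command("git status", {"git"}, lambda c: True, lambda c: False))
print(route_command("npm install", {"git"}, lambda c: True, lambda c: False))
```

The branch worth auditing is the first one: an allowlist entry is not just a skipped prompt, it is also a skipped sandbox.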

Sandbox and extensibility: OS enforcement vs hooks

Codex is the only one of the three whose sandbox is OS-level by design: Seatbelt with sandbox-exec on macOS, bwrap plus seccomp on Linux, and a native or WSL2 sandbox on Windows, with network disabled and writes scoped to the workspace. Admins can enable a network_proxy to constrain egress via domain allowlists. Cursor also offers a sandbox - read and write inside workspace directories, no network by default, the .cursor config directory protected regardless of allowlist - but documents it as best-effort. The boundary is not foolproof: the Cursor git-hooks sandbox escape RCE, CVE-2026-26268, is a reminder that a sandbox edge case is an escape, not just a warning.

Claude Code's extensibility story is different. It offers an optional OS-level sandbox for Bash, but its distinctive primitive is the PreToolUse hook: a shell command that runs before the permission prompt on every matching tool call and can return allow, deny, or ask after inspecting the resolved command. The hook cannot loosen the rules - a matching deny or ask still wins - but a hook that exits with code 2 stops a call before the rules are even evaluated. That programmable checkpoint is more expressive than a static allowlist, and it is the same surface attackers target when a local config is poisoned, as in the Claude Code project-file RCE and token exfiltration, CVE-2025-59536.
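A blocking hook can be very small. The sketch below assumes the hook receives a JSON event on stdin with tool_name and tool_input.command fields, which is our reading of the hooks documentation and worth verifying against the current docs. The blocklist is hypothetical, and substring matching has the same fragility as the argument patterns discussed earlier, so treat it as a shape rather than a policy.

```python
import json
import sys

# Hypothetical blocklist for illustration; substring checks are easy to evade.
BLOCKED_SUBSTRINGS = ("curl ", "wget ", "| sh")


def check(event: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one hook event."""
    if event.get("tool_name") != "Bash":
        return 0, ""
    command = event.get("tool_input", {}).get("command", "")
    for needle in BLOCKED_SUBSTRINGS:
        if needle in command:
            # Exit code 2 is the blocking exit code for the hook.
            return 2, f"blocked by policy hook: command contains {needle!r}"
    return 0, ""


def main() -> int:
    """Read one event from stdin, report on stderr, and return the exit code."""
    event = json.load(sys.stdin)
    code, message = check(event)
    if message:
        print(message, file=sys.stderr)
    return code
```

To use it as a hook script, end the file with sys.exit(main()) and register it against the Bash tool in the hooks configuration. Exit code 0 leaves the call to the normal permission rules, so the hook can tighten the policy but never loosen it.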

What an admin can actually enforce

For a security team the operative question is not the local UX, it is what survives once the laptop leaves your hands. Claude Code has the most developed answer: managed settings that cannot be overridden by user, project, or command-line scope, delivered via MDM or server-managed settings. An admin can deny tools, force ask rules, set disableBypassPermissionsMode to disable, and use allowManagedPermissionRulesOnly so only managed rules apply. Codex offers enterprise Managed configuration for workspace-wide sandbox, approval, network, and reviewer policy. Cursor offers team and enterprise controls - SSO, SCIM, LLM safety policies, compliance logging, MDM, and model blocklists - with run modes and allowlists configurable in settings or permissions.json. The table below lines them up.

Dimension	Claude Code	OpenAI Codex	Cursor
Default posture	Read-only auto; Bash and edits prompt on first use	Auto = workspace-write; network off; prompts outside workspace	Auto-review; reads free; terminal prompts unless allowlisted
Allow / deny model	Ordered rules: deny → ask → allow, deny absolute	Sandbox mode + approval policy (untrusted/on-request/never)	Allowlist + LLM classifier; denylist deprecated
Sandbox	Optional OS sandbox for Bash	OS-level (Seatbelt / bwrap+seccomp / WSL2); network off	Best-effort workspace sandbox; .cursor protected
Extensibility checkpoint	PreToolUse hook (allow/deny/ask, code-2 blocks)	Optional auto-review reviewer agent	LLM classifier under Auto-review
Admin enforcement	Managed settings (MDM / server), non-overridable	Enterprise Managed configuration	Team/enterprise: SSO, SCIM, policies, model blocklist, MDM
MCP trust	mcp__ rules; deny mcp__* removes all	Configured via approval policy / managed config	Per-server then per-tool approval; MCP allowlist
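For the Claude Code column, a managed settings payload built from the controls named above might look like the following. The setting names come from the docs cited earlier. The exact nesting (in particular, where allowManagedPermissionRulesOnly sits) and the specific deny and ask rules are our assumptions for illustration and should be checked against the current settings reference before deployment.

```python
import json

# Sketch of a managed-settings document; the rule strings are hypothetical.
managed_settings = {
    "permissions": {
        "deny": ["Bash(curl *)", "Read(./.env)"],
        "ask": ["Bash(git push *)"],
        # Prevents users from switching into bypass-permissions mode.
        "disableBypassPermissionsMode": "disable",
    },
    # Assumed placement: only rules from managed settings apply.
    "allowManagedPermissionRulesOnly": True,
}

rendered = json.dumps(managed_settings, indent=2)
print(rendered)
```

Generating the file from code rather than editing it by hand makes it easier to diff the intended policy against what an endpoint actually reports.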

What to check on a mixed fleet

The matrix above is useful for procurement, but the operational risk is that the strongest control in each column only helps if it was deployed. Managed settings only constrain a machine that received them. A Codex enterprise policy only binds an org-managed install. A Cursor run-mode policy only applies where the team enforced it. The defaults are permissive enough - workspace-write with no prompts, Auto-review running allowlisted commands immediately - that an unmanaged install is an open door, and the same primitive an admin can lock down is the one an attacker abuses when it is left to local configuration. We have written up both shapes: the OpenAI Codex branch-name command injection and GitHub token theft, and the Cursor CurXecute MCP prompt-injection RCE, CVE-2025-54135.

So the practitioner checklist is the same across all three even though the knobs differ: which install is managed and which is not; the effective default mode or approval policy per machine; whether a deny rule, allowlist, or block instruction is actually in force or only intended; whether MCP servers were trusted by hand and stayed trusted after the config changed; and whether any of it is recorded where a security team can query it later. Static configuration in three different formats does not answer those questions. That gap is the broader theme of securing AI coding agents and CLIs, and of the comment-and-control multi-agent prompt injection and credential theft case, where a poisoned input crossed a boundary no single tool's allowlist was watching. The same per-endpoint trust assumption underlies the Claude Code GitHub Action bot-actor bypass and the Anthropic MCP stdio-by-design RCE.
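The checklist translates naturally into one record per install plus a function that lists what is still unanswered. Everything in this sketch is hypothetical: the field names, the agent labels, and the gap messages are ours, and it does not reflect any vendor's schema.

```python
from dataclasses import dataclass


@dataclass
class AgentInstall:
    host: str
    agent: str  # e.g. "claude-code", "codex", "cursor"
    managed: bool
    effective_mode: str  # mode or approval policy observed on the machine
    rules_in_force: bool  # deny rule / allowlist verified, not just intended
    mcp_trust_rechecked: bool  # MCP trust re-reviewed after config changes
    centrally_logged: bool  # actions recorded where security can query them


def open_questions(install: AgentInstall) -> list[str]:
    """Return the checklist items this install does not yet satisfy."""
    gaps = []
    if not install.managed:
        gaps.append("install is unmanaged")
    if not install.rules_in_force:
        gaps.append("deny rule or allowlist not verified in force")
    if not install.mcp_trust_rechecked:
        gaps.append("MCP trust not re-reviewed after config change")
    if not install.centrally_logged:
        gaps.append("no queryable record of agent actions")
    return gaps


laptop = AgentInstall("dev-laptop-01", "codex", False, "workspace-write",
                      False, True, False)
for gap in open_questions(laptop):
    print(f"{laptop.host} ({laptop.agent}): {gap}")
```

The point of the structure is that the same four questions apply regardless of which agent the row describes, even though the evidence for each answer comes from a different config format.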

The layer above all three

None of these models is wrong, and none is the problem. Each gives a developer real, documented control on their own machine. What none gives a security team is a single view across machines and vendors: which endpoints run Claude Code, Codex, or Cursor; what mode, approval policy, or run mode each is in; which deny rules, allowlists, and MCP trust decisions are actually in effect; and a record of what every agent did. This is the layer Anomity inventories. The Endpoint Sensor discovers all three agents alongside the other AI artifacts on each managed machine and classifies them. Where an agent exposes a hook such as Claude Code's PreToolUse, it returns an allow, deny, or log decision on each tool call before it runs, using metadata only and redacting secrets on the endpoint. Every decision is written to a queryable 90-day audit trail and routed to your SIEM, Slack, or Jira. It complements each tool's enforcement rather than replacing it. If you are governing more than one coding agent and want that fleet-wide picture instead of three config formats, request early access.