← Back to blog Guide

Securing AI Coding Agents and CLIs: The Complete Guide for Security Teams (2026)

Anomity Research Anomity Research · Jun 12, 2026 · 13 min read

TL;DR

84% of developers now use or plan to use AI coding tools (2025 Stack Overflow Developer Survey), yet 46% do not trust the output - the gap between adoption and assurance is the exposure you have to govern.
Prompt injection is OWASP LLM01 - the number-one LLM risk for the second consecutive edition. Almost every coding-agent compromise in this cluster starts as untrusted text the agent treated as instructions.
The coding agent and its CLI are an endpoint, not an app: the malicious code runs inside a trusted binary, so EDR, DLP and network tools see nothing wrong - the risk lives in the AI artifact layer they were never built to inventory.
The same patterns repeat across vendors - allowlist bypass, MCP trust bypass, YOLO/auto-approve modes, branch-name and Git-hook injection, and fake-installer supply chain - so a per-tool patch is necessary but never sufficient.
The durable control is inventory the agents and CLIs, decide allow/deny/log at the hook before each tool call runs, and keep a queryable 90-day audit trail - exactly what Anomity does on every managed endpoint.
No CVE means no scanner coverage: several findings here (Tracebit Gemini CLI, the Comment-and-Control campaign) shipped without a CVE, so a version inventory beats an advisory feed.

Every security team in 2026 faces the same decision: you can block AI coding agents and watch developers route around you, or you can govern them - and most teams have no inventory to govern from. Securing AI coding agents is no longer an edge case, because the 2025 Stack Overflow Developer Survey put adoption at 84% of developers using or planning to use AI tools, with 51% using them every day. That is the install base; the exposure is that the same survey found 46% do not trust the output. This guide is the practitioner's playbook for that gap: what the risk actually is, why your existing controls miss it, how the same attack patterns repeat across every vendor, and how to inventory, govern and audit the agent and CLI layer. The first move is to see what you have - and as Anomity's fleet visibility shows, you cannot govern what you cannot see.

The scale is still climbing. Gartner projects that 90% of enterprise software engineers will use AI code assistants by 2028, up from under 14% in early 2024. Bottom-up adoption at that rate is the same dynamic that made AI agents and MCP servers the new shadow IT: tools arrive through individual developers, not procurement, and no security tool reports them by default. Treat this guide as the companion to the sibling pillar on MCP server security - the agents below are the clients that consume those servers.

Throughout, every claim is anchored to a real disclosure. The thirteen advisories in this cluster - covering Amazon Q, Claude Code, Cline, Cursor, Gemini CLI, GitHub Copilot and OpenAI Codex - are the evidence that the patterns are not theoretical. Where a control matters, it links to how Anomity implements it, and to the specific advisory that proves the case.

What is an AI coding agent, and why is it an endpoint and not an app?

An AI coding agent is a tool that reads your code and context, plans a change, and then acts - running shell commands, editing files, calling networks and invoking MCP servers - usually on a developer laptop or a CI runner. That last word, acts, is the whole problem. A chatbot returns text you choose to use; an agent executes. The moment a tool can run npm install, write to disk or open a network connection on its own, it is an endpoint with the developer's privileges, not a passive application.

This reframes the security question. You are not asking "is this app vulnerable?" - you are asking "what is this autonomous process allowed to do on this machine, and who decided?" The Anomity model treats the coding agent and its CLI as one of eight AI artifact types it inventories per endpoint - AI agents, MCP servers, extensions, skills, plugins, secrets, hooks and CLIs - because the risk is distributed across all of them, not concentrated in a single binary.

Property	Traditional dev tool	AI coding agent / CLI
Acts autonomously	No - executes only explicit commands	Yes - plans and runs tool calls on its own
Reads untrusted input as instructions	No	Yes - READMEs, issues, comments, context files
Reachable attack surface	The app and its dependencies	The agent, its MCP servers, extensions, hooks and CLIs
Visible to EDR / DLP / network	Largely yes	No - actions run inside a trusted process
Update cadence	Monthly to quarterly	Often weekly; many findings ship with no CVE

The practical consequence: a control that watches the network or the disk arrives too late, because the agent's own approved tool call is the exfiltration path. Governance has to sit at the point of action - the tool call - which is where Anomity's allow/deny/log operates.

Why don't my existing controls - EDR, DLP, network - catch this?

Because the malicious action wears a trusted uniform. When Google Gemini CLI before 0.1.14 chained ; env and ; curl past an allowlisted grep prefix, the environment variables left the machine through the agent's own approved tool call. EDR saw a signed binary doing its job. The network saw ordinary outbound HTTPS. DLP saw nothing at rest. Tracebit disclosed this two days after Gemini CLI's June 25, 2025 release, and it shipped with no CVE - see the Gemini CLI silent code execution advisory for the full chain.

Each existing control was built for a different layer. EDR watches processes and won't flag a coding agent it is told to trust. DLP watches data at rest and in motion, not the semantics of a tool call. Network tools watch traffic, not which AI artifacts exist on an endpoint. None of them inventory the agent layer or evaluate whether a specific tool call should run. Anomity is explicit that it complements, not replaces Network, EDR, DLP and GRC tooling - it covers the artifact layer those tools were never designed to see, which is exactly the outcomes gap this cluster keeps exposing.

Control	What it sees	What it misses for coding agents
EDR	Process and binary behavior	A trusted agent running attacker-chosen tool calls
DLP	Data at rest / in motion	Secrets leaving through the agent's approved channel
Network	Outbound traffic patterns	Ordinary HTTPS that is actually exfiltration
GRC / CVE scanners	Known versions with CVEs	No-CVE findings and unsanctioned / fake installs
Anomity	The 8 AI artifact types per endpoint	(this is the layer - inventory + allow/deny/log + audit)

What attack patterns actually repeat across vendors?

The vendor names change; the patterns do not. Across the thirteen advisories in this cluster, five families recur, and recognizing them is more useful than memorizing CVEs - because the next disclosure will be a new instance of an old shape. The throughline is prompt injection, which OWASP ranks LLM01, the top LLM risk for the second consecutive edition.

Allowlist and auto-approve bypass

Agents let users mark commands or modes as "always allow," then fail to re-check what actually runs. Gemini CLI matched only the start of a command, laundering chained env/curl past an approved grep. GitHub Copilot's YOLO-mode flaw, CVE-2025-53773, turned auto-approve into remote code execution - detailed in the Copilot VS Code YOLO-mode RCE advisory.

MCP trust bypass

Coding agents trust their configured MCP servers, and that trust is swappable. Cursor's MCPoison (CVE-2025-54136) let an approved MCP entry be silently replaced with a malicious one for persistent code execution; CurXecute (CVE-2025-54135) turned an MCP prompt injection into RCE. See the MCPoison and CurXecute advisories.

CI token theft via attacker-controlled input

Agents in CI run with privileged tokens and read attacker-controllable strings. OpenAI Codex was exploited through a malicious branch name to steal a GitHub token (BeyondTrust Phantom Labs); the Claude Code GitHub Action allowed a permission bypass and secret exfiltration; and Cline's GitHub Actions compromise produced an unauthorized npm release of [email protected]. See the Codex branch-name injection, Claude Code Action bypass, and Cline npm release advisories.

Project-file and hook execution

Files the agent reads to understand a repo become an execution vector. Claude Code project-file RCE and API-token exfiltration covered CVE-2025-59536 and CVE-2026-21852; Cursor's Git-hooks sandbox escape (CVE-2026-26268) abused hooks for RCE. See the Claude Code project-file RCE and Cursor Git-hooks escape advisories.

Supply chain and fake installers

When tools arrive off-policy, the installer itself is the threat. An SEO-poisoning campaign (EclecticIQ) delivered fake Gemini CLI and Claude Code installers carrying an infostealer; the Amazon Q Developer VS Code extension shipped a wiper-style prompt injection (GHSA-7g7f-ff96-5gcw); and the Comment-and-Control campaign weaponized GitHub comments for credential theft across Claude Code, Gemini CLI and Copilot. See the fake-installer, Amazon Q wiper, and Comment-and-Control advisories.

How do IDE agents on laptops differ from CLI agents in CI?

Same software family, different blast radius. An IDE agent on a laptop runs with a developer's interactive trust and often an auto-approve mode, but a relatively small token surface. A CLI agent in CI usually holds a privileged publish or cloud token and processes input an external contributor controls - branch names, PR titles, comments - which is why CI flaws in this cluster reached the highest severities. The Gemini CLI run-gemini-cli CI flaw was scored CVSS 10.0 (GHSA-wpqr-6v78-jr5g); see the run-gemini-cli RCE advisory.

Laptop / IDE agents - interactive approval and auto-approve modes are the weak point; govern file writes outside the workspace and unexpected outbound calls, and inventory extensions and skills, not just the core agent.
CI / CLI agents - privileged tokens plus attacker-controllable input are the weak point; treat branch names, PR titles and comments as untrusted, and never let an agent's tool call reach a publish token without a decision at the hook.
Both - run the same inventory-then-govern model; the difference is which tool calls you deny and what a leaked credential can reach, captured in the 90-day audit trail.

How do I choose and roll out controls without blocking developers?

The instinct to ban these tools backfires. A blanket block pushes usage to personal accounts and sideloaded binaries - exactly the conditions the fake-installer campaign exploited - and at 84% adoption you are not blocking a fringe behavior, you are blinding yourself to a mainstream one. The better posture is to sanction specific tools, inventory every install (sanctioned or not), and govern behavior at the tool call. Anomity collects metadata only and redacts secrets on the endpoint, so visibility does not mean reading developers' code.

Use a simple selection matrix. Score any candidate control against whether it can enumerate the artifact layer, decide before a tool call runs, keep a queryable record, and do so without changing the developer's workflow. Controls that only react after the fact - alerting on an exfiltration already in flight - fail the first test that matters.

Requirement	Block-by-policy	EDR / network alerting	Anomity
Inventories AI agents, MCP, extensions, CLIs	No	No	Yes - 8 artifact types per endpoint
Decides before the tool call runs	N/A	No (post-hoc)	Yes - allow/deny/log at the hook
Keeps a queryable audit trail	No	Partial	Yes - 90-day, queryable
Surfaces no-CVE and fake installs	No	No	Yes - by installed version, not advisory feed
Zero developer workflow change	No (drives off-policy)	Yes	Yes - metadata only, on-endpoint redaction

For the framework to slot this into a broader program, the AI security framework report and the agentic AI governance guide map these controls to policy and ownership, and the docs cover deployment specifics.

Where do the secrets and tokens actually leak?

Follow the credential and the leak path becomes obvious. Coding agents sit next to the most valuable secrets a developer holds - the agent's own API key, environment variables, cloud credentials, and in CI a publish token - and every exfiltration in this cluster moved one of them through a channel the agent was already allowed to use. Gemini CLI read environment variables with env and shipped them with curl; the Claude Code GitHub Action leaked secrets through a permission bypass; Codex stole a GitHub token via a poisoned branch name. The pattern is consistent: the secret never has to be stolen from a vault when the agent will read it from the environment and send it through an approved tool call.

The agent's own API key - a project-file injection that turns the agent against itself can exfiltrate it, as the Claude Code project-file finding showed.
Environment variables - the default loot, because they are readable by any process the agent spawns; redact them on the endpoint so a captured context carries no credentials.
CI publish and cloud tokens - the highest-value target, reachable when an agent's tool call in a pipeline is not gated; this is what produced the unauthorized [email protected] release.
MCP server credentials - trusted transitively, and swappable, as MCPoison demonstrated.

The control that holds across all four is the same: collect metadata only, redact secrets on the endpoint before anything leaves, and decide allow/deny/log on the tool call that would carry a credential out. That is the runtime governance boundary, and it is why a leaked token does not have to become a leaked release.

What edge cases trip teams up?

No-CVE findings. Tracebit's Gemini CLI bug and the Comment-and-Control campaign shipped without a CVE, so CVE-keyed scanners miss them. Find vulnerable builds by the version installed, not by an advisory feed.
Two CVE years in one finding. The Claude Code project-file issue spans CVE-2025-59536 and CVE-2026-21852 - a single exposure can carry multiple identifiers across reservation years; do not assume one CVE closes it.
Trust that survives a restart. MCPoison persisted because the agent re-trusted a swapped config. Persistence means a one-time scan is not enough; you need continuous inventory.
The installer is the malware. SEO-poisoned fake Gemini CLI and Claude Code installers mean a clean checksum of the wrong binary is worthless - provenance and source matter as much as version.
Auto-approve as a default. Copilot's YOLO-mode RCE shows that a convenience mode can be the whole vulnerability; inventory which agents have auto-approve enabled, not just which are installed.

What should I do in the next 30 days?

Sequence the work so each step earns the next. The order matters: governance and audit are only as good as the inventory underneath them, and the no-CVE findings in this cluster mean you cannot lean on a scanner to tell you what you have.

Inventory the artifact layer. Enumerate every AI agent, CLI, extension, MCP server and hook across developer endpoints and CI runners - by installed version and provenance, using Anomity's fleet visibility, not an advisory feed.
Flag the known-bad versions. Cross-reference the inventory against this cluster: Gemini CLI before 0.1.14, [email protected], and the fixed builds for Cursor, Copilot and Claude Code. Confirm none are fake-installer binaries.
Turn on decisions at the hook. On agents that expose a hook, enable allow/deny/log starting in log-only mode to baseline, then deny the high-risk classes: chained shell commands, writes outside the workspace, and tool calls that touch publish tokens.
Audit the auto-approve modes. Find every agent with auto-approve enabled - the Copilot YOLO-mode RCE shows this is often the whole vulnerability - and require a decision instead.
Wire the trail to your SIEM. Route the 90-day audit trail to SIEM, Slack, email or Jira so the next disclosure is a query, not a fire drill, and map the program to the agentic AI governance guide.

How Anomity governs ai agent & cli security

Concretely, three steps on the endpoint where the agent runs - no agent rewrite, no developer workflow change.

One: inventory. Anomity enumerates the eight AI artifact types on every managed endpoint and CI runner - AI agents, MCP servers, extensions, skills, plugins, secrets, hooks and CLIs - and classifies them. That is how you answer "which endpoints run Gemini CLI before 0.1.14?", "where is [email protected]?", "which agents have auto-approve on?", and "did anyone install a fake Claude Code binary?" - by installed version and provenance, not by an advisory feed that misses the no-CVE cases.

Two: decide at the hook. On agents that expose a hook - for example Claude Code's PreToolUse - Anomity evaluates each tool call against your policy and returns allow, deny or log before the call runs. A shell command that chains ; env and ; curl past an approved prefix is evaluated on the whole command; a write outside the workspace, an MCP config swap, or an outbound call to an unexpected host is denied at the boundary. This is the control that stops a leaked CI token from reaching a publish action and a project-file injection from spawning RCE - the runtime governance layer the post-hoc tools lack. Anomity collects metadata only and redacts secrets on the endpoint, so a captured context never carries credentials off the machine.

Three: keep the record. Every install, version change, artifact and decision lands in a queryable 90-day audit trail, routed to SIEM, Slack, email or Jira. When the next disclosure lands - CVE or not - you answer which endpoints ran the affected build, which repositories fed the agent untrusted input, and what those agents were allowed to do, from a record rather than a guess. Anomity is SOC 2 Type II and complements your Network, EDR, DLP and GRC tooling. See how it works and how it compares for where it fits.

You can't govern what you can't see.The Anomity principle

The decision framework is the same one you started with, now with a third option that is neither block nor blind trust: sanction the tools, inventory every agent and CLI across the fleet, and govern each tool call at the hook with a queryable record behind it. The thirteen advisories in this cluster show the same five patterns recurring across Amazon Q, Claude Code, Cline, Cursor, Gemini CLI, Copilot and Codex - a per-tool patch is necessary but never sufficient, because the next disclosure is a new instance of an old shape. Start with the inventory, because you cannot govern what you cannot see. To see Anomity inventory and govern the agent and CLI layer across your fleet, request early access.

Frequently asked questions

What does "securing AI coding agents" actually mean in practice?

It means treating each coding agent and its CLI as a managed endpoint, not a desktop app. In practice that is three things: knowing which agents, extensions, MCP servers and CLIs are installed on every developer machine and CI runner; deciding allow, deny or log on each tool call the agent proposes before it runs, especially shell commands, file writes and network calls; and keeping a queryable record of what ran and what was blocked. The patch-and-pray model fails because most of these tools update weekly and many findings ship without a CVE. Anomity covers all three on the endpoint where the agent actually executes.

Why don't EDR, DLP and network tools already catch this?

Because the malicious action runs inside a trusted process using approved channels. When Gemini CLI chained ; env and ; curl past an allowlisted grep prefix, EDR saw a signed binary doing its job, the network saw ordinary outbound HTTPS, and DLP saw nothing at rest - the environment variables left through the agent's own approved tool call. EDR watches processes, DLP watches data flows, network tools watch traffic; none of them inventory which AI agents, MCP servers or CLIs exist on an endpoint or evaluate the semantics of a tool call. Anomity complements those controls by governing the AI artifact layer they cannot see.

What is prompt injection and why is it the root cause here?

Prompt injection is when untrusted content - a README, an issue comment, a context file, a branch name - is read by the agent and interpreted as an instruction rather than as data. OWASP ranks it LLM01, the top LLM risk for the second consecutive edition, because LLMs process instructions and data in the same channel with no hard separation. In this cluster it is the recurring root cause: Tracebit's Gemini CLI finding, the Comment-and-Control campaign across Claude Code, Gemini CLI and Copilot, the Cursor CurXecute MCP RCE, and the Amazon Q wiper all began as injected text. You cannot fully prevent it, so you govern what the agent is allowed to do after it reads the text.

Are CLI agents in CI riskier than IDE agents on a laptop?

They carry different blast radii, and CI is often worse. A CLI agent in a CI pipeline typically runs with a privileged token - a GitHub Actions token, an npm publish token, cloud credentials - and processes attacker-controllable input such as pull-request titles, branch names and comments. The Cline GitHub Actions compromise led to an unauthorized npm release of [email protected]; the Gemini CLI run-gemini-cli CI flaw reached CVSS 10.0; and OpenAI Codex was exploited through a malicious branch name to steal a GitHub token. IDE agents on a laptop have a smaller token surface but more interactive trust and auto-approve modes. Inventory and govern both; do not assume CI is hardened.

We already block AI tools by policy. Isn't that enough?

Policy without inventory is a guess. Developers adopt these tools bottom-up the way agents and MCP servers became the new shadow IT, and the 2025 Stack Overflow survey put adoption at 84% - far higher than most blocklists assume. A blanket block also pushes usage to unmanaged installs and personal accounts, which is harder to see, not easier. The SEO-poisoning campaign that delivered fake Gemini CLI and Claude Code installers laced with an infostealer thrives precisely when people sideload tools off-policy. A better posture is to allow sanctioned tools, inventory every install, and govern tool calls at the hook so you control behavior rather than pretend the tools are not there.

What is the single highest-value control to start with?

A complete, continuous inventory of the AI artifact layer on every developer endpoint and CI runner. You cannot govern, patch or audit what you have not enumerated, and most findings in this cluster turn on the specific version installed - Gemini CLI before 0.1.14, [email protected], the Cursor and Copilot fixed builds. An inventory also surfaces the no-CVE findings that scanners miss and the unsanctioned or fake installs that policy misses. Once you can see the fleet, the next control is allow/deny/log at the hook on agents that expose one, then a 90-day audit trail. Anomity delivers all three from one agent on the endpoint.

Does runtime governance slow developers down?

Done at the wrong layer, yes; done at the hook, no. Anomity evaluates each tool call against policy and returns a decision before the call runs, so ordinary read-only and edit operations pass without friction and only the risky ones - a chained shell command, an unexpected outbound call, a write outside the workspace - are denied or sent for review. Developers and employees do not have to change tools or workflow. The point is to remove the binary choice between "block the agent" and "trust it completely" by governing individual actions, which is far less disruptive than an outage caused by an agent that exfiltrated a token or wiped a workspace.

How does this relate to MCP server security?

Closely - MCP is how many coding agents reach tools and data, so it is a shared attack surface. Cursor's MCPoison flaw (CVE-2025-54136) let an approved MCP entry be silently swapped for a malicious one, and CurXecute (CVE-2025-54135) turned an MCP prompt injection into RCE. The same inventory-then-govern model applies: enumerate MCP servers as one of the eight AI artifact types, then decide allow/deny/log on the tool calls they mediate. For the MCP-specific surface - transport risks, tool poisoning and trust models - see the sibling pillar on MCP server security; this guide focuses on the agents and CLIs that consume those servers.