
Writing an AI Acceptable Use Policy for Agents and MCP That Reduces Shadow AI Instead of Driving It Underground

Anomity Research · Jun 21, 2026 · 12 min read

TL;DR

Blanket AI bans do not stop usage. They push it onto personal devices and accounts where you have zero visibility, trading a manageable risk for an unmonitored one.
IBM's 2025 Cost of a Data Breach Report found shadow AI involved in 20 percent of breaches, adding roughly $670K to breach cost, and 97 percent of orgs with an AI-related breach lacked proper AI access controls.
The policy's spine is a tiered tool taxonomy (sanctioned, tolerated, prohibited) paired with a data-classification matrix defining what data may touch which tier.
Treat MCP servers as third-party software: provenance review, OAuth 2.1 plus PKCE, least-privilege scopes, and an approved-server registry.
Every clause needs a detectable signal behind it: network egress, CASB, browser DLP, AI/MCP gateway, and OAuth consent audits. Policy without visibility is theater.
Map the AUP to NIST AI RMF, ISO/IEC 42001, and the EU AI Act so it is citable, audit-ready, and survives a compliance review.

Most AI acceptable use policies fail the same way. A security team, alarmed by employees pasting customer data into a chatbot, writes a one-line rule: do not use AI tools. Six months later, AI usage has not dropped. It has moved. It is on personal phones, personal accounts, and home networks, and the security team can no longer see any of it.

This is the central problem an AI acceptable use policy has to solve in 2026. The goal is not to eliminate AI use, which is neither realistic nor desirable. The goal is to keep AI use on infrastructure you can observe and govern. A policy that drives usage underground has not reduced risk; it has blinded itself to risk. This guide walks through how to write an AUP that does the opposite, with specific attention to the layer most policies ignore entirely: autonomous AI agents and Model Context Protocol (MCP) servers.

What an AI acceptable use policy actually is

An AI acceptable use policy is the governing document that defines three things: which AI tools and capabilities people may use, what data is allowed to flow into them, and how that usage is monitored and enforced. A traditional AUP covers software and web access. An AI AUP narrows in on a category of tools that behave unlike anything that came before, because modern AI does not just receive data. Agents read it, reason over it, call other tools, and take actions on a user's behalf.

That distinction matters for scope. A policy written for ChatGPT-style chatbots covers a fraction of the surface. The fastest-growing and least-governed part of the fleet is agentic: coding assistants that execute shell commands, agents that query databases through MCP servers, and workflows that chain tool calls together. We have argued before that AI agents are the new shadow IT, and an AUP that does not name agents and MCP servers explicitly leaves the riskiest tools unaddressed.

Why blanket bans backfire

The evidence that bans do not work is consistent. Across 2025 surveys, roughly half of employees admitted to using unsanctioned AI tools at work, a large majority reported bringing their own tools, and close to half said they would continue using personal AI even if their employer formally banned it. Samsung's 2023 decision to restrict ChatGPT after engineers pasted source code into it is the canonical example of the reflex, and the reflex is understandable. It is also counterproductive when it is the entire strategy.

The reason is mechanical, not cultural. When a tool is sanctioned, its traffic runs through corporate identity, corporate networks, and corporate logging. When it is banned, the demand does not vanish, so the same work happens on a personal device with a personal account over a home connection. You have not removed the data exposure. You have removed your ability to detect it. The risk is now identical in substance and invisible in practice.

IBM's 2025 Cost of a Data Breach Report puts numbers on the cost of that blindness. Shadow AI was involved in 20 percent of breaches, and those incidents added roughly $670,000 to the average breach cost. The governance gap underneath is the more damning finding: 97 percent of organizations that suffered an AI-related breach lacked proper AI access controls, and 63 percent of breached organizations either had no AI governance policy or were still developing one. Among organizations that did have a policy, only about a third audited for unsanctioned AI use at all. Shadow AI incidents disproportionately exposed personally identifiable information and intellectual property.

Read those last figures together. Having a policy on paper and detecting violations of it are different capabilities, and most organizations have the first without the second. That gap is the theme of this guide: a clause you cannot detect is a clause you cannot enforce.

The spine of the policy: tiers and data classification

A workable AUP is built on two intersecting structures. The first is a tiered tool taxonomy that sorts AI tools by how much trust they have earned. The second is a data classification matrix that defines what kind of data may touch each tier. Neither works alone. Tiers without data classes tell you what tool to use but not what to put in it; data classes without tiers tell you how sensitive your data is but not where it is allowed to go.

The three tiers

Sanctioned - tools that have been vetted, contracted with enterprise data protections, and wired into SSO and logging. This is the paved road. Make it genuinely good, because the paved road only reduces shadow AI if it is easier than the alternative.
Tolerated - tools permitted for non-sensitive (Public and Internal) data only, while under active monitoring and pending full vetting. This tier is the policy's most important and most often omitted piece. It gives people a legitimate, observable path to use a tool you have not finished evaluating, which is exactly what keeps usage from going underground.
Prohibited - banned tools, or banned combinations of tool and data, such as feeding regulated data into a consumer free tier whose terms allow training on inputs. Prohibition should be specific and justified, not a blanket reflex, so that the hard rules are few enough to remember and are the ones that matter most.

The tolerated tier is the design choice that separates a policy that reduces shadow AI from one that grows it. A binary sanctioned-or-banned model forces every new tool into prohibition until vetting finishes, and vetting always lags demand. The tolerated tier acknowledges reality: people will try new tools, so give them a supervised lane rather than pretending they will wait.

The data classification matrix

Pair the tiers with a small, legible set of data classes. Four is usually enough: Public, Internal, Confidential, and Restricted/Regulated. The last class is where PII, PHI, PCI data, secrets and credentials, and proprietary source code live. The matrix below is the single most quotable artifact in the policy, because it answers the question employees actually ask: can I put this in that?

Data class	Sanctioned tier	Tolerated tier	Prohibited tier
Public	Allowed	Allowed	N/A
Internal	Allowed	Allowed (monitored)	Not permitted
Confidential	Allowed (approved tools only)	Not permitted	Not permitted
Restricted / Regulated (PII, PHI, PCI, secrets, source code)	Allowed only in tools with a signed data-protection agreement and no training on inputs	Prohibited	Prohibited
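
One way to make the matrix enforceable rather than just readable is to encode it as a lookup that tooling can query. The sketch below is a minimal illustration, not a reference implementation: the enum names, the three verdict strings, and the default-deny behavior are assumptions made for the example. In particular, the N/A cell is treated as a denial here, on the assumption that a prohibited tool should not be in use at all.

```python
from enum import Enum


class Tier(Enum):
    SANCTIONED = "sanctioned"
    TOLERATED = "tolerated"
    PROHIBITED = "prohibited"


class DataClass(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"
    RESTRICTED = "restricted"


# Illustrative verdicts: "conditional" means the cell carries extra conditions.
ALLOW, CONDITIONAL, DENY = "allow", "conditional", "deny"

# Only the permitted cells of the matrix are listed; everything else is denied.
MATRIX = {
    (DataClass.PUBLIC, Tier.SANCTIONED): ALLOW,
    (DataClass.PUBLIC, Tier.TOLERATED): ALLOW,
    (DataClass.INTERNAL, Tier.SANCTIONED): ALLOW,
    (DataClass.INTERNAL, Tier.TOLERATED): CONDITIONAL,       # monitored
    (DataClass.CONFIDENTIAL, Tier.SANCTIONED): CONDITIONAL,  # approved tools only
    (DataClass.RESTRICTED, Tier.SANCTIONED): CONDITIONAL,    # signed DPA, no training on inputs
}


def verdict(data_class, tier):
    """Look up a (data class, tier) pair; unlisted pairs default to deny."""
    return MATRIX.get((data_class, tier), DENY)
```

Listing only the permitted cells keeps the table short and means a newly added data class or tier is denied until someone makes an explicit decision about it.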

The matrix also forces a useful conversation with legal and privacy. Inputs to consumer LLM free tiers may be used to train the provider's models, which is the practical reason regulated data cannot go there, and it ties directly to GDPR obligations. If your agents process personal data, the GDPR-for-AI-agents lawful-basis and erasure questions belong in the same policy review.

How the threat model changes for agents and MCP

Chatbot policies assume a human in the loop reading every output. Agentic AI breaks that assumption. An agent reads untrusted content, decides what to do, and acts, often without a human checking each step. That shift introduces failure modes a chatbot-era AUP never had to consider, and the policy needs to name them so its clauses have a rationale.

The clearest organizing concept is Simon Willison's lethal trifecta, coined in June 2025: the combination of access to private data, exposure to untrusted content, and the ability to communicate externally. Any agent or MCP configuration that holds all three is exploitable, because an attacker can plant instructions in the untrusted content (a web page, a ticket, a document) and ride the agent's private-data access out through its external channel. We cover the mechanics in the lethal trifecta and agent data exfiltration. The AUP rule that follows is simple to state: no agent configuration may combine all three legs of the trifecta without explicit compensating controls and approval.
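
The trifecta rule lends itself to a mechanical check over an agent inventory. The following is a rough sketch under stated assumptions: the `AgentConfig` fields, including `approved_exception` as a stand-in for "compensating controls reviewed and signed off", are hypothetical names invented for illustration. In practice the hard part is populating these flags accurately, not evaluating them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    name: str
    reads_private_data: bool
    reads_untrusted_content: bool
    can_communicate_externally: bool
    # Hypothetical flag: compensating controls were reviewed and approved.
    approved_exception: bool = False


def has_lethal_trifecta(cfg):
    """True when the configuration holds all three legs at once."""
    return (
        cfg.reads_private_data
        and cfg.reads_untrusted_content
        and cfg.can_communicate_externally
    )


def violates_trifecta_rule(cfg):
    """A violation is all three legs without an approved exception."""
    return has_lethal_trifecta(cfg) and not cfg.approved_exception
```

Removing any single leg clears the check, which is usually the cheapest remediation: an agent that reads private data and untrusted content but has no outbound channel does not trip the rule.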

Prompt injection - ranked the number one risk (LLM01:2025) in the OWASP Top 10 for LLM Applications. Indirect injection arrives through content the agent reads, so the policy needs an untrusted-content handling clause. See indirect prompt injection explained.
Tool poisoning - hidden malicious instructions embedded in an MCP server's tool descriptions or metadata, which the model reads and obeys. This is the direct motivation for the MCP provenance and approval clause.
Memory injection - research such as the MINJA attack (arXiv 2503.03704, a NeurIPS 2025 poster) showed an agent's persistent memory can be poisoned through ordinary queries so that a later victim query is compromised. Policies that allow persistent-memory agents need behavior monitoring to match.
Confused-deputy and token reuse - an agent holding a broad token can be tricked into using it on an attacker's behalf, which is why per-resource, least-privilege scoping matters more than for ordinary apps.

The MCP server approval clause

The most common gap in 2026 AI policies is that they govern models but ignore the connective tissue. MCP servers are software that grants agents new capabilities, and they must be treated as third-party software subject to review, not as plugins anyone can install. A single compromised server in a chain of connected MCP servers is enough to compromise the agent, and security researchers have demonstrated high attack-success rates once one untrusted server is in the loop.

A defensible MCP clause covers provenance, authentication, scope, and registry:

Provenance and supply chain - only approved servers from known sources, with pinned versions. Arbitrary remote MCP servers are prohibited by default. The broader pattern is covered in our MCP server security guide.
Authentication - internet-accessible MCP servers must implement OAuth 2.1 with PKCE, consistent with the MCP authorization spec and the underlying RFCs for protected-resource metadata (RFC 9728) and resource indicators (RFC 8707). See OAuth for MCP servers explained.
Least-privilege scopes - per-resource tokens, narrowly scoped, no shared broad credentials. This is least privilege for AI agents applied to the tool layer.
Approved-server registry - a maintained inventory of which servers are allowed, who owns each, and what scopes they hold. Building one is the subject of how to build an MCP server registry.
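
The four parts of the clause can be sketched as a single check against a registry. This is an illustrative example only: the registry layout, the server name `tickets-mcp`, the version string, the owner, and the scope names are all made up, and a real registry would live in a datastore rather than a module-level dict. Authentication is out of scope here, since OAuth 2.1 with PKCE is enforced at the server and not by a lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryEntry:
    source: str                # provenance: where the server is obtained from
    pinned_version: str        # the only version approved to run
    owner: str                 # accountable team or person
    allowed_scopes: frozenset  # least-privilege scopes approved for this server


# Hypothetical registry contents, for illustration only.
REGISTRY = {
    "tickets-mcp": RegistryEntry(
        source="internal-artifact-store",
        pinned_version="1.4.2",
        owner="platform-team",
        allowed_scopes=frozenset({"tickets:read"}),
    ),
}


def check_server(name, version, requested_scopes):
    """Return a list of policy findings; an empty list means the request passes."""
    entry = REGISTRY.get(name)
    if entry is None:
        # Unregistered servers are prohibited by default.
        return [f"{name}: not in the approved-server registry"]
    findings = []
    if version != entry.pinned_version:
        findings.append(
            f"{name}: version {version} does not match pinned {entry.pinned_version}"
        )
    extra = sorted(set(requested_scopes) - entry.allowed_scopes)
    if extra:
        findings.append(f"{name}: scopes beyond approval: {', '.join(extra)}")
    return findings
```

Returning findings instead of a boolean lets the same function feed both a blocking gate and an audit report.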

The OWASP MCP Top 10, published in 2025, gives this section a standards backbone to cite, with named risk categories from token mismanagement and tool poisoning to context injection and over-sharing. Anchoring MCP clauses to a named framework is what makes them survive an audit rather than reading as one team's preferences.

Developer tools: coding assistants and agentic IDEs

AI coding assistants and agentic IDEs deserve their own section because they quietly satisfy every leg of the lethal trifecta. They read source code (private data), ingest content they did not author, such as third-party dependencies, issue text, web pages, and MCP tool output (untrusted content), and can reach the network (external communication). They also execute tools and shell commands, so a successful injection translates directly into action. The 2025 class of agent vulnerabilities, including agents that could be steered to exfiltrate tokens through injected content, is the real-world illustration of why this surface needs explicit rules rather than the same clause you wrote for chatbots.

A developer-tool clause should specify what repositories and secrets agents may access, whether tool calls auto-run or require human approval, which MCP servers are allowed in the development environment, and an unambiguous prohibition on pasting proprietary code into unvetted consumer assistants. Permission posture varies sharply between tools, which is why securing AI coding agents and CLIs and the fleet-level view in governing AI coding assistants across your fleet are useful companions to this section. Auto-run, sometimes called YOLO mode, is the setting most worth a default-off rule in the policy.
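
A developer-tool clause of this kind can be checked against each workstation's assistant configuration. The sketch below assumes a simplified, tool-agnostic config dict with hypothetical keys `auto_run` and `mcp_servers`; real coding assistants each have their own settings format, so a per-tool adapter would be needed to produce this shape. The approved server names are likewise invented.

```python
# Hypothetical allow-list of MCP servers permitted in development environments.
APPROVED_DEV_MCP_SERVERS = frozenset({"tickets-mcp", "docs-mcp"})


def audit_dev_tool_config(config):
    """Return policy findings for one developer-tool config (empty list = compliant)."""
    findings = []
    # A missing setting is treated as the unsafe value, so unset configs get reviewed.
    if config.get("auto_run", True):
        findings.append("auto_run must be off: tool calls require human approval")
    unapproved = sorted(set(config.get("mcp_servers", [])) - APPROVED_DEV_MCP_SERVERS)
    if unapproved:
        findings.append(f"unapproved MCP servers: {', '.join(unapproved)}")
    return findings
```

Treating an absent `auto_run` key as a finding is a deliberate design choice in this sketch: it keeps the default-off rule from being bypassed by simply leaving the setting undeclared.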

Enforcement: map every clause to a signal

This is the section that separates a real policy from a wish. Every clause should be tied to a detectable signal in a control layer you already operate. If you cannot describe how a violation would be observed, the clause is theater. The layers differ in what they can see and what they can do about it, so list both.

Policy clause	Where you see it	What it can do
Which AI domains employees reach, and who	Network / secure web gateway egress logs	Warn, block, redirect
Unsanctioned apps and personal-account use	CASB / SaaS telemetry, OAuth token inventory	Alert, revoke, block app
Pasting PII, source code, or secrets into AI tabs	Browser-layer DLP and extension discovery	Warn, block paste, isolate
Risky OAuth scopes granted to AI tools	Identity / OAuth consent audit	Flag, revoke consent
Agent and MCP tool calls and scope use	AI / MCP gateway	Inspect, enforce scope, block
Lethal-trifecta configurations in the fleet	Agent/MCP discovery and inventory	Detect, alert, require approval
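
The "no signal, no clause" test can itself be automated as a lint over the policy. Below is a minimal sketch, assuming each clause has a short identifier; the identifiers are invented for this example, while the signal names are taken from the table above.

```python
# Clause identifiers are hypothetical; signal sources follow the table above.
CLAUSE_SIGNALS = {
    "ai-domain-access": ["secure web gateway egress logs"],
    "unsanctioned-apps": ["CASB / SaaS telemetry", "OAuth token inventory"],
    "sensitive-paste": ["browser-layer DLP"],
    "risky-oauth-scopes": ["OAuth consent audit"],
    "agent-tool-calls": ["AI / MCP gateway"],
    "lethal-trifecta": ["agent/MCP discovery and inventory"],
}


def unenforceable_clauses(policy_clauses, signals=CLAUSE_SIGNALS):
    """Return clauses that have no detection source mapped to them."""
    return [clause for clause in policy_clauses if not signals.get(clause)]
```

Running this whenever the policy changes flags a newly added clause, for example one covering persistent-memory agents, until someone names the control layer that would observe a violation.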

Two practical notes. First, the browser is increasingly the right enforcement point for shadow AI, because that is where employees paste data and install AI extensions; enterprise browser management lets you discover unsanctioned AI extensions and apply DLP at the moment of paste. Second, what most teams cannot see at all is the agent and MCP layer, because none of the legacy controls were built to inventory autonomous tools. That gap is precisely the one DLP for AI agents describes when it explains why traditional DLP misses agent traffic.

Framework alignment makes the policy citable

An AUP that maps to recognized frameworks is easier to defend internally and easier for an auditor (or an answer engine) to trust. Three mappings carry most of the weight, and they play different roles.

Framework	Nature	Role for the AUP
NIST AI RMF	Voluntary, process-focused (Govern, Map, Measure, Manage)	The AUP is the Govern artifact that operationalizes it
ISO/IEC 42001	Certifiable management-system standard	Provides the management-system scaffolding the policy plugs into
EU AI Act	Mandatory, outcome-focused (Article 10 data governance; high-risk obligations applying from Aug 2, 2026)	Sets binding obligations the policy must satisfy for in-scope systems

Frame the AUP as the document that turns these frameworks into day-to-day rules. For deeper treatment, see the NIST AI RMF for AI agents, ISO/IEC 42001 agent governance, and EU AI Act for AI agents guides. The point of citing them is not box-checking; it is that a policy anchored to external standards outlives the team that wrote it.

A template skeleton you can adapt

Keep the document short enough that people read it. A workable structure:

Purpose and scope - who and what is covered, including agents and MCP servers, not just chatbots.
Definitions - sanctioned, tolerated, prohibited; the data classes; agent and MCP server.
Tier and data-class matrix - the table from earlier, made specific to your tool list.
MCP server approval - provenance, OAuth 2.1 with PKCE, least-privilege scopes, registry.
Developer-tool rules - repo and secret access, auto-run posture, allowed dev MCP servers.
Monitoring and enforcement - the clause-to-signal mapping and what each layer may do.
Exceptions and approvals - a real workflow for requesting a tolerated or new tool, with an owner and an SLA.
Review cadence - a fixed schedule (quarterly is reasonable given how fast this moves) and a named owner.

The exception workflow deserves emphasis. The fastest way to push usage underground is to make the approved path slow or invisible. A lightweight request form with a committed turnaround does more to reduce shadow AI than any prohibition, because it makes compliance the path of least resistance.

Where continuous agent and MCP visibility fits

The recurring constraint in everything above is visibility. The tolerated tier assumes you can watch what flows through it. The MCP registry assumes you know which servers exist. The lethal-trifecta rule assumes you can detect configurations that combine all three legs. Enforcement assumes you can see violations. Strip out the visibility and the entire policy reverts to paper.

This is the category Anomity works in. Discovering and inventorying every AI agent and MCP server across the fleet is what makes the tolerated tier, and the tools nobody has classified yet, governable rather than theoretical; monitoring permissions and behavior is what surfaces risky OAuth scopes, lethal-trifecta configurations, and prohibited data flows; and the audit trail is what your framework mappings demand when an auditor asks for evidence. You can read how we approach discovery in inside Anomity discovery and the broader case in why we built Anomity.

The shorter version is the line we keep coming back to. A policy tells people what to do. Visibility tells you whether they did it. You can write the best AI acceptable use policy in your industry, but if you cannot see the agents and MCP servers it governs, you have not reduced shadow AI. You have only documented your hope that it went away.

Frequently asked questions

What is an AI acceptable use policy?

An AI acceptable use policy (AUP) is the governing document that defines which AI tools, agents, and MCP servers employees may use, what data may flow into them, and how usage is monitored and enforced. For agentic AI it must go beyond chatbots to cover autonomous agents that read data, call tools, and act on systems on a user's behalf.

Why do blanket AI bans backfire?

Bans do not remove the demand for AI; they remove your visibility into it. Surveys in 2025 found roughly half of employees use unsanctioned AI tools and nearly half say they would keep using personal AI even if it were formally banned. A ban pushes that activity onto personal laptops, personal accounts, and home networks where no monitoring control can see it, converting a governable risk into an invisible one.

What are the tiers in a tiered AI tool taxonomy?

Three tiers: Sanctioned tools (vetted, contracted, with SSO and logging, the paved road), Tolerated tools (permitted for non-sensitive data only while under active monitoring and pending full vetting), and Prohibited tools and uses (banned tools or banned data-tool combinations). The tolerated tier is the pressure-release valve that keeps usage in the open.

How should an AUP handle MCP servers?

Treat MCP servers as third-party software requiring review, not as freely installable add-ons. The policy should require provenance and supply-chain checks, pinned versions, OAuth 2.1 with PKCE for internet-accessible servers, least-privilege scopes, and an approved-server registry. Arbitrary remote MCP servers should be prohibited by default.

What is the lethal trifecta and why does it belong in an AI policy?

Coined by Simon Willison in June 2025, the lethal trifecta is the combination of access to private data, exposure to untrusted content, and the ability to communicate externally. Any agent or MCP configuration that has all three is exploitable via prompt injection, so the AUP should prohibit such configurations unless compensating controls are in place.

How do you enforce an AI acceptable use policy?

Map every clause to a detectable signal across the controls you already run: network and secure web gateway egress logs, CASB and SaaS telemetry, browser-layer DLP and extension discovery, endpoint agents, an AI or MCP gateway for tool-call inspection, and OAuth consent audits for risky scopes. A clause with no detection behind it is unenforceable.

Which frameworks should an AI AUP align to?

Map the AUP to NIST AI RMF (Govern, Map, Measure, Manage), ISO/IEC 42001, and the EU AI Act, with GDPR for personal data in prompts. Treat the AUP as the Govern artifact that operationalizes these frameworks, which makes it audit-ready and citable in a compliance review.

Should developer AI coding tools be covered by the AUP?

Yes, and they are often the biggest blind spot. AI coding assistants and agentic IDEs read source code, ingest untrusted content such as dependencies and MCP tool output, and can reach the network, which is a textbook lethal-trifecta surface. The policy must address repo and secret access, auto-run versus human approval for tool calls, allowed MCP servers in development, and a prohibition on pasting proprietary code into unvetted consumer assistants.