Cisco's AI Security and Safety Framework: What It Covers for AI Agents (2026)
- The Cisco Integrated AI Security and Safety Framework (arXiv:2512.12921, Dec 2025) is a research-grade taxonomy that merges *AI security* and *AI safety* into one model.
- Its core is a four-layer, MITRE ATT&CK-style structure: 19 objectives (OB-001–OB-019) → 40 techniques → 112 subtechniques → procedures, grouped into Common Manipulation, Data-Related, and Downstream/Impact risks.
- It embeds a 14-threat MCP taxonomy, a 22-threat supply-chain taxonomy, and a 25-category content-safety harm taxonomy, with a companion 17-threat agent-to-agent (A2A) taxonomy - the MCP and A2A layers are its most agent-relevant pieces.
- Cisco ties the framework operationally to its AI Defense product plus open-source MCP Scanner and A2A Scanner tools, and maps it to MITRE ATLAS, NIST AI 100-2, and the OWASP LLM/Agentic Top 10s.
- Operationalizing it for agents requires continuous discovery and monitoring of agents and MCP servers - the framework names the risks; it does not inventory your fleet for you.
In December 2025, Cisco published the Cisco Integrated AI Security and Safety Framework as a formal research report (arXiv:2512.12921). It is not a product datasheet and not a marketing pillar diagram - it is a lifecycle-aware taxonomy that tries to classify the full range of AI risk in one structure, the way MITRE ATT&CK did for enterprise intrusions. For security leaders drowning in partial standards, that ambition is the point.
The framework's defining move is merging two communities that usually talk past each other: AI security (protecting AI systems from unauthorized use, availability attacks, and integrity compromise) and AI safety (ensuring systems behave ethically, reliably, fairly, and in alignment with human values). Cisco argues that for agentic systems - software that plans, calls tools, and acts - the line between a safety failure and a security breach has effectively dissolved, so the taxonomy treats them as one. This guide walks through what the framework actually contains, how it maps onto AI agents and MCP servers, and how a security team puts it to work.
What the Cisco AI security framework is (and is not)
The framework is a taxonomy and operationalization model, authored by Cisco researchers Amy Chang, Tiffany Saade, Sanket Mendapara, Adam Swanda, and Ankit Garg, and released as version 1. Its stated goal is to classify AI risks across modalities, agents, pipelines, and the broader ecosystem in a single, structured reference rather than a checklist of point fixes.
One clarification matters before anything else: the framework is not the same thing as Cisco AI Defense, the commercial product. Cisco ties the framework to AI Defense operationally, alongside open-source MCP Scanner and A2A Scanner tools, but the framework itself is the published research artifact. When someone says "the Cisco AI security framework," they should mean arXiv:2512.12921, not the product brochure.
Cisco explicitly positions the framework as broader than the existing reference points. It argues that MITRE ATLAS, the NIST AI 100-2 adversarial ML (AML) taxonomy, and the OWASP Top 10s for LLM and Agentic AI applications each cover only a partial "slice" of AI risk, and that none of them integrate security with safety or treat the full lifecycle from data through runtime. The framework also maps its own items back to those references, so adopting it does not mean abandoning them - it means subsuming them under one structure.
The four-layer threat taxonomy
The core of the framework is a four-layer structure modeled directly on MITRE ATT&CK. It moves from intent down to observed behavior:
| Layer | Name | What it captures | Count |
|---|---|---|---|
| Layer 1 | Objectives | The "why" - attacker goals | 19 (OB-001–OB-019) |
| Layer 2 | Techniques | The "how" - methods mapped to objectives | 40 |
| Layer 3 | Subtechniques | Specific variants of techniques | 112 |
| Layer 4 | Procedures | Real-world / observed implementations | Open-ended library (not comprehensively cataloged) |
The 19 objectives are organized into three risk groups, which is the most useful mental model for triage:
- Common Manipulation Threats - objectives such as Goal Hijacking (OB-001), Communication Compromise (OB-004), jailbreaks, impersonation/obfuscation, and persistence.
- Data-Related Threats - objectives covering data privacy violations, integrity degradation and sabotage, supply-chain compromise, model theft and extraction, and adversarial evasion.
- Downstream Threats and Impact Risks - objectives covering action-space and integration abuse, availability abuse, privilege compromise, harmful or misleading content, surveillance, and cross-modal risks.
Techniques sit beneath objectives and are labeled in an AITech-x.y scheme; they include direct and indirect prompt injection, jailbreaks, memory corruption, tool exploitation, supply-chain tampering, and environment-aware evasion. Subtechniques (AISubtech-x.y.z) capture specific variants of a technique. Procedures are the bottom layer: documented, observed implementations attackers have actually used, which the report notes are not comprehensively cataloged. This ATT&CK-style nesting is what makes the framework usable for detection mapping rather than just awareness.
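To make the nesting concrete, here is a minimal Python sketch of how a detection pipeline might recognize the three identifier styles. The regular expressions are assumptions inferred from the labels quoted above (OB-001, AITech-x.y, AISubtech-x.y.z), and the idea that a subtechnique's first two numbers name its parent technique is also an assumption; check both against the report before relying on them. `classify_id` and `parent_technique` are hypothetical helper names.

```python
import re

# Assumed ID shapes, inferred from the labels quoted in the text.
# The report itself is the authority on the exact format.
OBJECTIVE_RE = re.compile(r"OB-(\d{3})")
TECHNIQUE_RE = re.compile(r"AITech-(\d+)\.(\d+)")
SUBTECHNIQUE_RE = re.compile(r"AISubtech-(\d+)\.(\d+)\.(\d+)")


def classify_id(identifier: str) -> str:
    """Return the taxonomy layer an identifier appears to belong to."""
    match = OBJECTIVE_RE.fullmatch(identifier)
    # The text states objectives run from OB-001 through OB-019.
    if match and 1 <= int(match.group(1)) <= 19:
        return "objective"
    if TECHNIQUE_RE.fullmatch(identifier):
        return "technique"
    if SUBTECHNIQUE_RE.fullmatch(identifier):
        return "subtechnique"
    return "unknown"


def parent_technique(subtechnique_id: str) -> str:
    """Derive a parent technique ID, assuming x.y.z nests under x.y."""
    match = SUBTECHNIQUE_RE.fullmatch(subtechnique_id)
    if match is None:
        raise ValueError(f"not a subtechnique ID: {subtechnique_id!r}")
    return f"AITech-{match.group(1)}.{match.group(2)}"
```

The numeric IDs used to exercise these helpers (for example `AITech-1.2`) are placeholders for illustration, not references to specific entries in the report.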
Five design elements
Cisco frames the whole structure around five deliberate design choices. They are worth naming because they explain why the taxonomy is shaped the way it is:
- Integration of AI security threats and content/safety harms into one taxonomy.
- AI lifecycle awareness, spanning data, training, and runtime rather than a single stage.
- Multi-agent coordination as a first-class concern, not an afterthought.
- Multimodality, covering text, image, audio, and cross-modal attack surfaces.
- An audience-aware utility that presents the same taxonomy at different depths for executives, security leaders, engineers, and red teams.
The embedded taxonomies
Beyond the four-layer core, the framework embeds supporting taxonomies for the surfaces that generic AI risk lists tend to flatten. These are where most of the agent- and infrastructure-specific detail lives.
The 14-threat MCP taxonomy
The most agent-relevant component is a dedicated Model Context Protocol (MCP) taxonomy of 14 threats across four groups: Injection & Interpretation (3), Tool Integrity (3), Data Exfiltration & Access (4), and Execution & Payload (4). It covers how an LLM interprets tools, prompts, metadata, and execution environments through MCP. Cisco treats MCP as critical agentic infrastructure and a major attack surface - consistent with what we see in the field and with the existence of Cisco's own MCP Scanner. For deeper grounding on this surface, see MCP Server Security: The Complete Guide and our writeup on MCP tool poisoning.
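One way to put the four MCP groups to work is as a coverage scorecard. The sketch below is illustrative only: the group names and counts come from the taxonomy as described above, while `coverage_report` and the idea of counting "threats with at least one detection" are our own assumptions, not something the framework prescribes.

```python
# Group sizes as described in the text: 3 + 3 + 4 + 4 = 14 threats.
MCP_GROUPS = {
    "Injection & Interpretation": 3,
    "Tool Integrity": 3,
    "Data Exfiltration & Access": 4,
    "Execution & Payload": 4,
}


def coverage_report(covered: dict[str, int]) -> dict[str, float]:
    """Return, per group, the fraction of threats that have a detection.

    `covered` maps a group name to the number of its threats that your
    team has at least one detection for (a hypothetical input format).
    """
    report = {}
    for group, total in MCP_GROUPS.items():
        count = covered.get(group, 0)
        if not 0 <= count <= total:
            raise ValueError(f"{group}: {count} is outside 0..{total}")
        report[group] = count / total
    return report


example = coverage_report({"Tool Integrity": 3, "Execution & Payload": 1})
```

A group sitting at 0.0 is a quick signal of where MCP detections are missing entirely.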
The 17-threat agent-to-agent (A2A) taxonomy
Cisco pairs the MCP taxonomy with a companion agent-to-agent (A2A) taxonomy of 17 threats, addressing the protocols that govern how agents discover, message, and delegate to one another - where impersonation, poisoned inter-agent messaging, and orchestration abuse live. It is published as a standalone taxonomy and operationalized through Cisco's open-source A2A Scanner, alongside the MCP Scanner. The v1 report text frames the agentic taxonomies as actively expanding, so expect the A2A items to keep evolving - but the 17-threat set is real and usable today, not merely promised.
The 22-threat supply-chain taxonomy
A separate supply-chain taxonomy enumerates 22 threats across four categories: artifact and format vulnerabilities, model manipulation and tampering (poisoning, malicious adapters), dependency and distribution compromise (typosquatting, CI/CD attacks), and operational/runtime threats (code execution, exfiltration). This is the layer that governs whether an MCP server or model artifact is safe to adopt in the first place - adjacent to our AI Supply-Chain Attacks: A Defender's Guide.
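As a rough illustration of using these categories as an adoption gate, the sketch below approves a candidate only when every category has been reviewed and passed. The category names follow the text; the review record format and the `adoption_decision` helper are hypothetical, and a real gate would track the 22 individual threats rather than four pass/fail flags.

```python
# The four supply-chain categories named in the text.
SUPPLY_CHAIN_CATEGORIES = (
    "artifact and format vulnerabilities",
    "model manipulation and tampering",
    "dependency and distribution compromise",
    "operational/runtime threats",
)


def adoption_decision(review: dict[str, bool]) -> tuple[bool, list[str]]:
    """Approve only if every category was reviewed and passed.

    A category missing from `review` is treated as not yet reviewed,
    which blocks adoption (fail closed).
    """
    blocking = [c for c in SUPPLY_CHAIN_CATEGORIES if not review.get(c, False)]
    return (not blocking, blocking)


# Hypothetical review of a candidate MCP server package.
approved, blockers = adoption_decision({
    "artifact and format vulnerabilities": True,
    "model manipulation and tampering": True,
    "dependency and distribution compromise": False,  # e.g. suspected typosquat
})
```

Failing closed on unreviewed categories is a deliberate choice here: an artifact nobody has looked at should not be treated the same as one that passed.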
The 25-category content-safety harm taxonomy
Finally, a content-safety harm taxonomy spans 25 categories grouped into five families: cybersecurity and hacking, safety harms and toxicity (hate, self-harm, CBRN and similar), integrity compromise (hallucination, unauthorized financial/legal/medical advice), intellectual-property compromise, and privacy attacks (PII/PHI/PCI). This is the "safety" half of the integration made concrete and measurable.
How it applies to AI agents and MCP servers
Agentic surfaces are well covered across the framework: the core objectives carry the autonomy and tool-access risks, the MCP taxonomy carries the infrastructure risks, and the A2A taxonomy carries the inter-agent and orchestration risks. Read the agentic coverage as the union of those three places rather than a single dedicated layer.
With that in mind, the mapping from framework items to real agent risk is direct. The table below is our reading, intended as an operational bridge:
| Framework item | Agentic risk | What to monitor |
|---|---|---|
| Action-Space & Integration Abuse | Excessive agency - an agent executing unintended actions via its tools | Every agent's tool bindings and scope creep over time |
| 14-threat MCP taxonomy | Malicious or compromised MCP servers on the endpoint/fleet | MCP server inventory, tool integrity, injection and exfiltration paths |
| 17-threat A2A taxonomy | Impersonation and orchestration abuse between agents | Inter-agent messaging, delegation chains, multi-agent collusion |
| Goal Hijacking (OB-001) + Communication Compromise (OB-004) | Prompt-injection takeover and poisoned inter-agent messaging | Behavioral deviation in an agent's goals and call patterns |
| Persistence + Impersonation objectives | Shadow and unidentified agents operating invisibly | Discovery of unauthorized/unknown agents across the fleet |
| Privilege Compromise | Over-permissioned agents and credential misuse | What each agent and MCP server is authorized to touch |
| Data Privacy Violations + PII/PHI/PCI harms | Sensitive data leaking through context windows and tool calls | Data exposure across agent inputs and outputs |
| Supply Chain Compromise (22 threats) | Typosquatted MCP packages, poisoned artifacts, malicious adapters | Vetting status of MCP servers before fleet adoption |
Two of these deserve emphasis. Persistence and impersonation objectives map cleanly to the shadow-agent problem - agents and MCP servers that run without anyone knowing they exist. That is the same blind spot we describe in Why AI Agents and MCP Servers Are the New Shadow IT: you cannot tag a risk to an OB-ID for an agent your inventory has never seen. And multi-agent and A2A risks - collusion, orchestration abuse, poisoned delegation - only surface with fleet-wide cross-agent visibility; single-agent monitoring misses emergent behavior by construction. Our analysis of multi-agent prompt injection and credential theft shows how quickly that compounds.
How a security team operationalizes it
A taxonomy earns its keep when it changes what you do on Monday. A practical adoption path:
- Inventory first. Map your real AI footprint - agents, coding assistants, MCP servers - to the objectives and the MCP and A2A taxonomies. You cannot classify risk for assets you have not discovered.
- Tag detections to OB-IDs. Use the four-layer Objectives → Techniques → Subtechniques → Procedures structure as the labeling scheme for alerts. This gives GRC a consistent vocabulary and aligns red-team findings with defensive coverage.
- Prioritize by risk group. Triage with the three groups - Common Manipulation, Data-Related, Downstream/Impact - so leadership sees coverage gaps at a glance.
- Gate adoption with the supply-chain taxonomy. Vet MCP servers and model artifacts against the 22 supply-chain threats before they reach the fleet.
- Govern continuously, not at review time. The objectives describe runtime risks (drift, persistence, privilege creep) that a point-in-time audit cannot catch.
The fifth step, continuous governance, is where most programs break. For coding agents specifically, the operational detail is heavier than a taxonomy can carry - see Securing AI Coding Agents and CLIs and Governing AI Coding Assistants Across Your Fleet for what that monitoring looks like in practice.
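The tagging and triage steps in the adoption path above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the `Alert` record and `rollup_by_group` helper are hypothetical, and the lookup table contains only the two objective IDs this guide names explicitly. The remaining OB-IDs and their group assignments would have to be filled in from the report.

```python
from collections import Counter
from dataclasses import dataclass

# Only the two objective IDs named in this guide are filled in.
# The other 17 must be taken from the report itself.
OBJECTIVE_GROUP = {
    "OB-001": "Common Manipulation",  # Goal Hijacking
    "OB-004": "Common Manipulation",  # Communication Compromise
}


@dataclass(frozen=True)
class Alert:
    """A hypothetical alert record tagged with a framework objective."""
    agent: str
    summary: str
    objective_id: str


def rollup_by_group(alerts: list[Alert]) -> dict[str, int]:
    """Count alerts per risk group; unknown OB-IDs land in 'unmapped'."""
    counts = Counter(
        OBJECTIVE_GROUP.get(alert.objective_id, "unmapped") for alert in alerts
    )
    return dict(counts)


alerts = [
    Alert("build-agent", "goal changed after reading issue text", "OB-001"),
    Alert("triage-agent", "unexpected instruction in peer message", "OB-004"),
    Alert("docs-agent", "objective not yet in lookup table", "OB-019"),
]
summary = rollup_by_group(alerts)
```

Keeping an explicit "unmapped" bucket makes gaps in the lookup table visible in the rollup instead of silently dropping alerts.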
Where continuous agent and MCP visibility fits
The Cisco framework is a strong reference for *what* can go wrong. It does not, and is not meant to, tell you *which* agents and MCP servers exist on your endpoints and fleet right now - that is an inventory problem, not a taxonomy problem. The objectives that depend most on it (persistence, impersonation, privilege compromise, the entire MCP and A2A taxonomies) are exactly the ones that assume you already have continuous discovery in place.
This is the category Anomity works in: discovering and inventorying every AI agent and MCP server, monitoring their permissions and behavior, alerting on anomalies, and producing an audit trail. The four-layer structure gives that audit trail a ready vocabulary - alerts tagged to OB-IDs and techniques for GRC reporting and red-team alignment. The framework supplies the taxonomy; continuous visibility supplies the subjects it classifies. If you want to see what that discovery surfaces in real environments, what we find when we scan AI agent configs is a candid starting point.
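At its simplest, the inventory gap described above reduces to a set comparison between what is observed running and what has been approved. The sketch below is illustrative only: the identifiers are made up, `find_inventory_gaps` is a hypothetical helper, and it does not represent any particular product's discovery logic.

```python
def find_inventory_gaps(observed: set[str], approved: set[str]) -> dict[str, list[str]]:
    """Compare observed agents/MCP servers against the approved inventory.

    'shadow'   = observed running but never approved.
    'not_seen' = approved but not currently observed anywhere.
    """
    return {
        "shadow": sorted(observed - approved),
        "not_seen": sorted(approved - observed),
    }


# Hypothetical identifiers for illustration.
observed = {"mcp:filesystem", "mcp:unknown-scraper", "agent:code-assistant"}
approved = {"mcp:filesystem", "agent:code-assistant", "agent:support-bot"}
gaps = find_inventory_gaps(observed, approved)
```

Anything in the "shadow" list is an asset that cannot yet be tagged to an OB-ID, because no one has classified it.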
Adopt the Cisco framework as your shared language for AI risk. Then make sure your fleet is actually visible enough to use it - because, as ever, you can't govern what you can't see.
Frequently asked questions
What is the Cisco AI security framework?
It is the Cisco Integrated AI Security and Safety Framework, a research report (arXiv:2512.12921) published in December 2025 by Cisco's AI threat and security researchers. It is a unified, lifecycle-aware taxonomy that classifies AI risks across modalities, agents, pipelines, and the broader ecosystem, merging AI security with AI safety in one structure.
Is the Cisco AI security framework the same as Cisco AI Defense?
No. The framework is a published taxonomy and operationalization model. Cisco AI Defense is a separate commercial product. Cisco ties the framework operationally to AI Defense plus open-source MCP Scanner and A2A Scanner tools, but the framework itself is the research artifact (arXiv:2512.12921), not the product.
How many objectives and techniques does it define?
The four-layer taxonomy defines 19 attacker objectives (OB-001 through OB-019), 40 techniques, and 112 subtechniques, plus a library of real-world procedures. The 19 objectives are grouped into Common Manipulation, Data-Related, and Downstream/Impact risk categories.
Does the Cisco framework cover AI agents and MCP servers?
Yes. It embeds a dedicated 14-threat Model Context Protocol (MCP) taxonomy across four groups: Injection & Interpretation, Tool Integrity, Data Exfiltration & Access, and Execution & Payload. Cisco also publishes a companion 17-threat agent-to-agent (A2A) taxonomy, paired with its open-source A2A Scanner, and addresses agentic risks like tool access and orchestration abuse within the core objectives.
How is the framework different from MITRE ATLAS, NIST AML, and OWASP LLM Top 10?
Cisco positions it as broader than those, arguing each covers only a partial slice of AI risk. The Cisco model integrates security and safety, spans the full AI lifecycle, and adds embedded MCP, A2A, supply-chain, and content-safety taxonomies in one structure. It also maps its items to MITRE ATLAS, the NIST AI 100-2 adversarial ML taxonomy, and the OWASP Top 10s for LLM and Agentic AI applications.
What is the difference between AI security and AI safety in this framework?
AI security covers protecting AI systems from unauthorized use, availability attacks, and integrity compromise. AI safety covers ensuring systems behave ethically, reliably, fairly, and in alignment with human values. The framework's signature move is merging both into one taxonomy rather than treating them as parallel tracks.
How does a security team operationalize the Cisco framework?
Map your real AI inventory to the objectives and MCP/A2A taxonomies, tag detections and alerts to OB-IDs for GRC and red-team alignment, prioritize controls by the three risk groups, vet artifacts against the supply-chain taxonomy, and feed continuous agent and MCP discovery into the model. The framework provides the taxonomy; you supply the inventory and monitoring.




