OWASP Top 10 for LLM Applications (2025): A Security Team's Breakdown
- The OWASP Top 10 for LLM Applications is a community-built, vendor-neutral list of the most critical risks in LLM-backed software, published by the OWASP GenAI Security Project under CC BY-SA 4.0.
- The 2025 edition (released 18 November 2024) keeps Prompt Injection (LLM01) at #1, moves Sensitive Information Disclosure to #2, and adds two new categories: System Prompt Leakage (LLM07) and Vector and Embedding Weaknesses (LLM08).
- Excessive Agency (LLM06) is the closest anchor for agentic and MCP risk - a system granted more functionality, permissions, or autonomy than it needs.
- The list is application-centric: it only partially covers autonomous-agent and MCP-specific threats. OWASP's companion deliverables (Agentic AI – Threats and Mitigations, and the OWASP Top 10 for Agentic Applications) go deeper.
- Operationalizing it for agents means inventorying every agent and MCP server, monitoring per-tool permissions, watching for behavioral anomalies, and keeping an audit trail.
If you are securing anything built on large language models, the OWASP Top 10 for LLM Applications is the first reference you should put in front of your team. It is the most widely cited list of the critical security risks in LLM-backed software, and it has quietly become the shared vocabulary security engineers, GRC teams, and AI platform owners use to talk about a problem space that did not exist a few years ago.
The catch: the list was designed for *applications* that call a model, not for *autonomous agents* that call tools, spawn sub-agents, and reach across your environment through MCP servers. That gap matters. This guide breaks down the 2025 taxonomy accurately, then maps each entry onto the agentic reality your fleet is already living in - and points to where OWASP's own newer work picks up where the LLM Top 10 stops.
What the OWASP Top 10 for LLM Applications actually is
The list is published by the OWASP GenAI Security Project - an OWASP Foundation effort that was previously named the OWASP Top 10 for LLM Applications project. Like every OWASP Top 10, it is community-built, vendor-neutral, and openly licensed (Creative Commons Attribution-ShareAlike 4.0). It is not a certifiable compliance standard. It is a *prioritized risk reference*: a ranked, well-documented starting point for threat modeling, test design, and control requirements.
The original list shipped in 2023. The current 2025 edition was released on 18 November 2024. The 2025 revision reordered and consolidated the earlier entries based on real-world production data and community feedback, and added two brand-new categories. Each risk carries a stable identifier of the form LLMNN:2025 (for example, LLM01:2025, where NN is the entry's two-digit position in the list) and a consistent structure: a definition, common examples, example attack scenarios, prevention and mitigation strategies, and reference links.
That LLM01:2025 style of ID is worth adopting in your own tickets and findings. Because the edition year is part of the identifier, it stays unambiguous even when entries are renumbered between editions, so auditors and engineers can always point at the same thing.
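If you adopt these IDs in tickets, it helps to be able to pull them back out of free text. The following is a minimal sketch, assuming the LLMNN:YYYY shape described above with NN restricted to 01 through 10; the function name `find_risk_ids` and the `RiskId` type are our own illustrations, not part of any OWASP tooling.

```python
import re
from typing import NamedTuple


class RiskId(NamedTuple):
    rank: int      # position in the list, 1-10
    edition: int   # edition year stamped into the identifier


# Assumed shape: "LLM" + two-digit rank (01-10) + ":" + four-digit edition year.
_ID_PATTERN = re.compile(r"\bLLM(0[1-9]|10):(\d{4})\b")


def find_risk_ids(text: str) -> list[RiskId]:
    """Return every LLMNN:YYYY identifier found in free text, in order of appearance."""
    return [RiskId(int(rank), int(year)) for rank, year in _ID_PATTERN.findall(text)]


if __name__ == "__main__":
    note = "Finding maps to LLM06:2025 and LLM01:2025; LLM11:2025 is not a valid entry."
    print(find_risk_ids(note))
```

Keeping the edition year in the parsed result lets you group findings by edition if a later list renumbers the entries.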
The 2025 taxonomy, entry by entry
Here is the full 2025 list. Two entries - LLM07 System Prompt Leakage and LLM08 Vector and Embedding Weaknesses - are new this edition. The latter is OWASP's first dedicated treatment of Retrieval-Augmented Generation (RAG) and embedding-store risk.
| ID | Risk | What it covers |
|---|---|---|
| LLM01:2025 | Prompt Injection | User or external content manipulates prompts to alter intended behavior, bypass guardrails, or trigger unintended actions. Includes direct and indirect (data-borne) injection. Remains #1. |
| LLM02:2025 | Sensitive Information Disclosure | Exposure of PII, secrets, proprietary or training data through model outputs and surrounding systems. Rose to #2 in 2025. |
| LLM03:2025 | Supply Chain | Vulnerabilities across the LLM pipeline: third-party models, datasets, pre-trained weights, adapters/LoRAs, plugins, and dependencies. |
| LLM04:2025 | Data and Model Poisoning | Malicious or corrupted data introduced during pre-training, fine-tuning, or embedding to manipulate behavior, plant backdoors, or degrade integrity. |
| LLM05:2025 | Improper Output Handling | Insufficient validation, sanitization, or encoding of model output before it reaches downstream systems - enabling XSS, SSRF, SQLi, RCE, or privilege escalation. |
| LLM06:2025 | Excessive Agency | A system granted more functionality, permissions, or autonomy than necessary, letting it take harmful actions via tools/extensions/plugins. |
| LLM07:2025 | System Prompt Leakage | NEW. Disclosure of system/developer prompts that reveal instructions, guardrails, secrets, or logic an attacker can exploit. |
| LLM08:2025 | Vector and Embedding Weaknesses | NEW. Flaws in vector stores and embeddings used by RAG: embedding inversion, data-store poisoning, cross-tenant leakage, access-control gaps. |
| LLM09:2025 | Misinformation | Generation and propagation of false, misleading, or hallucinated information users may over-rely on, leading to unsafe decisions. |
| LLM10:2025 | Unbounded Consumption | Uncontrolled compute, query, or cost usage causing denial of service, denial of wallet, or model extraction. Expands the earlier 'Model Denial of Service' concept. |
Two framing notes for accuracy. First, the list is *ranked*, but the ordering reflects risks observed in production LLM applications, not a strict severity score. Second, the entries are not mutually exclusive: a single real incident commonly chains several of them (for example, an indirect prompt injection that drives improper output handling, which in turn exploits excessive agency). Treat the IDs as labels for a graph of related failure modes, not ten isolated boxes.
How the list maps to AI agents and MCP servers
An LLM application has a relatively contained blast radius: a prompt goes in, text comes out, maybe a function call or two. An AI agent is a different animal. It plans, calls tools, reads and writes data, invokes MCP servers, and - increasingly - coordinates with other agents. Every LLM Top 10 risk takes on extra weight when the model can *act* rather than merely *advise*. Here is how the entries land in an agentic environment.
LLM06 Excessive Agency - the core agentic anchor
This is the entry that most directly describes the agent problem. OWASP defines Excessive Agency as a system granted more functionality, permissions, or autonomy than it needs. An autonomous agent wired to a dozen MCP servers with broad scopes is the textbook case: excessive functionality (tools it never needs), excessive permissions (write access where read would do), and excessive autonomy (high-impact actions with no human in the loop). OWASP's prescribed mitigations - least privilege, human-in-the-loop for high-impact actions, downstream authorization, and logging and rate-limiting of tool invocations - read like a control list for an agent fleet. The hard part is not the controls; it is knowing what agents and tools exist in the first place. We dug into this in Why AI Agents and MCP Servers Are the New Shadow IT.
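To make the least-privilege and human-in-the-loop mitigations concrete, here is a minimal sketch of a gate that sits in front of tool invocation. It assumes you can enumerate each agent's permitted tools and mark some as high-impact; `ToolPolicy`, `authorize_call`, and the tool names are hypothetical and do not come from OWASP or any MCP SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPolicy:
    allowed_tools: frozenset       # least privilege: tools this agent may call at all
    high_impact_tools: frozenset   # tools that additionally require human approval


def authorize_call(policy: ToolPolicy, tool: str, human_approved: bool = False) -> str:
    """Return 'allow', 'needs_approval', or 'deny' for a proposed tool call."""
    if tool not in policy.allowed_tools:
        return "deny"
    if tool in policy.high_impact_tools and not human_approved:
        return "needs_approval"
    return "allow"


if __name__ == "__main__":
    policy = ToolPolicy(
        allowed_tools=frozenset({"read_file", "delete_repo"}),
        high_impact_tools=frozenset({"delete_repo"}),
    )
    for name in ("read_file", "delete_repo", "send_email"):
        print(name, "->", authorize_call(policy, name))
```

The default-deny branch is the important design choice: a tool the policy has never heard of is refused rather than allowed, which is also what surfaces tools you did not know existed.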
LLM01 Prompt Injection - the path to weaponizing agency
Prompt injection is how an over-privileged agent gets hijacked. Indirect injection - malicious instructions buried in a document, a web page, or a tool's response - can steer an agent into invoking MCP tools or exfiltrating data without the user ever typing the malicious prompt. Static input filtering misses most of it because the payload arrives through data the agent was told to trust. This is why behavioral anomaly detection on agent *actions* matters: an agent that suddenly reads a credentials file and opens an outbound connection is deviating from its baseline regardless of how the instruction got in. The multi-agent case is especially nasty; we walk through it in Multi-agent prompt injection and credential theft.
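As a rough illustration of baselining agent actions, the sketch below counts (agent, tool) pairs from known-good history and flags any new event whose pair was rarely or never seen. This is a deliberately simple frequency model, assumed for illustration only; a production detector would look at richer signals such as arguments, sequences, and egress destinations. All names here are hypothetical.

```python
from collections import Counter
from typing import Iterable

Event = tuple[str, str]  # (agent name, tool name)


def build_baseline(history: Iterable[Event]) -> Counter:
    """Count how often each (agent, tool) pair appeared in known-good history."""
    return Counter(history)


def out_of_pattern(baseline: Counter, events: Iterable[Event], min_seen: int = 1) -> list[Event]:
    """Return events whose (agent, tool) pair was seen fewer than min_seen times."""
    return [event for event in events if baseline[event] < min_seen]


if __name__ == "__main__":
    baseline = build_baseline([
        ("docs-agent", "search"),
        ("docs-agent", "search"),
        ("docs-agent", "read_file"),
    ])
    today = [("docs-agent", "search"), ("docs-agent", "open_socket")]
    print(out_of_pattern(baseline, today))
```

Note that the check never inspects the prompt: it flags the unusual action regardless of how the instruction reached the agent.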
LLM03 Supply Chain - where shadow MCP servers live
The Supply Chain entry covers third-party models, datasets, weights, adapters, plugins, and dependencies. For agents, the sharpest version is the unvetted MCP server a developer installs in minutes without any review. Each one is a new dependency with its own code, network behavior, and trust assumptions. Inventorying every MCP server and flagging the unvetted ones is, squarely, a supply-chain control. We cover the defender's side in AI Supply-Chain Attacks: A Defender's Guide, the server side in MCP Server Security: The Complete Guide, and a real campaign in MCP Tool Poisoning.
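A first-pass inventory check can be as small as comparing a client configuration against a vetted allowlist. The sketch below assumes a JSON config with a top-level `mcpServers` object keyed by server name, a layout some MCP clients use; your clients may store this differently, and the allowlist and function name are hypothetical.

```python
import json


def unvetted_servers(config_text: str, vetted: set[str]) -> list[str]:
    """List server names present in a client config but missing from the vetted allowlist."""
    config = json.loads(config_text)
    # Assumption: servers live under a top-level "mcpServers" object keyed by name.
    servers = config.get("mcpServers", {})
    return sorted(name for name in servers if name not in vetted)


if __name__ == "__main__":
    example_config = json.dumps({
        "mcpServers": {
            "github": {"command": "npx", "args": ["example-github-server"]},
            "mystery-tool": {"command": "node", "args": ["server.js"]},
        }
    })
    print(unvetted_servers(example_config, vetted={"github"}))
```

Matching on name alone is weak, since a malicious server can reuse a trusted name. A sturdier version would compare the command, arguments, and a hash of the package or binary as well.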
LLM05 Improper Output Handling - when output becomes an action
In a chatbot, unsanitized output is a display problem. In an agent, output frequently *becomes* the next tool call or a shell command executed downstream. Improper output handling stops being XSS and starts being delete, transfer, or exec. The control is to treat every agent output crossing a boundary as untrusted input to the next system - validate, encode, and authorize at the receiving end, not just at the prompt. This is acute for coding agents and CLIs, which we cover in Securing AI Coding Agents and CLIs.
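Treating agent output as untrusted input can be sketched as a validator on the receiving side. The example assumes the agent emits tool calls as JSON of the form `{"tool": ..., "args": {...}}`; that wire format, the `ALLOWED_CALLS` table, and the tool names are all hypothetical. Real deployments would typically use a proper schema validator and add an authorization check per caller.

```python
import json

# Hypothetical allowlist: tool name -> the exact argument names and types the receiver accepts.
ALLOWED_CALLS = {
    "read_file": {"path": str},
    "create_ticket": {"title": str, "priority": int},
}


def validate_tool_call(raw_output: str) -> dict:
    """Parse model output as a tool call; raise ValueError for anything off the allowlist."""
    try:
        call = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError("output is not valid JSON") from exc
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    tool = call.get("tool")
    if not isinstance(tool, str) or tool not in ALLOWED_CALLS:
        raise ValueError(f"tool not allowed: {tool!r}")
    schema = ALLOWED_CALLS[tool]
    args = call.get("args")
    if not isinstance(args, dict) or set(args) != set(schema):
        raise ValueError("unexpected or missing arguments")
    for name, expected in schema.items():
        # Exact type match, so a JSON boolean is not accepted where an int is expected.
        if type(args[name]) is not expected:
            raise ValueError(f"argument {name!r} must be {expected.__name__}")
    return {"tool": tool, "args": args}


if __name__ == "__main__":
    print(validate_tool_call('{"tool": "read_file", "args": {"path": "README.md"}}'))
```

The validator rejects extra arguments as well as missing ones, which closes off the case where a hijacked agent smuggles an unexpected parameter into an otherwise legitimate call.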
The rest, briefly
- LLM02 Sensitive Information Disclosure - agents and MCP servers with read access to repos, CRMs, and databases can silently move sensitive data off the endpoint. The control is fleet-wide data-access visibility plus an audit trail.
- LLM04 Data and Model Poisoning and LLM08 Vector and Embedding Weaknesses - when a RAG-backed agent acts on a poisoned knowledge base, or retrieves content that leaked across tenants, the corrupted context steers autonomous decisions, not just answers. Embedding inversion and data-store poisoning become action-level risks.
- LLM07 System Prompt Leakage - agent and MCP system prompts often encode tool scopes, credentials, or guardrail logic. Leaking them hands an attacker the map to abuse the agent's agency.
- LLM09 Misinformation - when an agent *executes* on a hallucinated fact instead of merely surfacing it, the harm is amplified.
- LLM10 Unbounded Consumption - runaway agent loops and recursive tool/MCP calls produce denial-of-wallet. Rate-limiting and monitoring tool invocation are the controls, and they overlap with the Excessive Agency mitigations.
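For the Unbounded Consumption entry, the rate-limiting and loop-monitoring controls can be sketched as a per-task budget that every tool call must pass through. This is a minimal illustration with assumed limits; `InvocationBudget` is a hypothetical name, and a real implementation would likely also track elapsed time and token or dollar cost.

```python
class InvocationBudget:
    """Caps total tool calls and nesting depth for a single agent task."""

    def __init__(self, max_calls: int, max_depth: int) -> None:
        self.max_calls = max_calls
        self.max_depth = max_depth
        self.calls = 0

    def check(self, depth: int) -> None:
        """Record one tool call at the given nesting depth, or raise if a limit is exceeded."""
        if depth > self.max_depth:
            raise RuntimeError(f"loop depth {depth} exceeds limit {self.max_depth}")
        if self.calls >= self.max_calls:
            raise RuntimeError(f"call budget of {self.max_calls} exhausted")
        self.calls += 1


if __name__ == "__main__":
    budget = InvocationBudget(max_calls=2, max_depth=3)
    budget.check(depth=1)
    budget.check(depth=2)
    try:
        budget.check(depth=1)
    except RuntimeError as err:
        print("blocked:", err)
```

Failing closed with an exception means a runaway loop stops at the budget rather than at the invoice.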
The honest scope caveat
Be precise with leadership about what this list does and does not cover. The LLM Top 10 is application-centric. It only partially addresses autonomous-agent and MCP-specific threats: shadow agents running on endpoints you never approved, agent identity and authentication, trust between agents in a multi-agent system, and tool misuse at scale. OWASP knows this, which is why it treats deeper agent risk in companion deliverables rather than stretching this list to fit.
Those companions are the more precise references for the agentic problem space: the Agentic AI – Threats and Mitigations guide from OWASP's Agentic Security Initiative, and the separate OWASP Top 10 for Agentic Applications, released in December 2025. If your environment is agent-heavy, read the LLM Top 10 as the on-ramp and the agentic deliverables as the destination.
Operationalizing it across a fleet
A list of risks is only useful if each entry resolves to a control and a signal you can actually collect. Here is a practical way to translate the taxonomy into operational work for an agent and MCP fleet.
| OWASP entry | Agentic manifestation | What to monitor / enforce |
|---|---|---|
| LLM06 Excessive Agency | Agents with broad tool scopes and no approval gates | Per-tool permission inventory; least privilege; human-in-the-loop on high-impact actions; rate-limit invocations |
| LLM01 Prompt Injection | Indirect injection via documents, web, tool responses | Behavioral baselines on agent actions; anomaly alerts on out-of-pattern tool calls and egress |
| LLM03 Supply Chain | Unvetted / shadow MCP servers | Discover and inventory every MCP server; flag unvetted ones; review before enablement |
| LLM02 Sensitive Info Disclosure | Agents reading repos/CRMs/DBs and moving data | Data-access visibility; egress monitoring; tamper-evident audit trail |
| LLM05 Improper Output Handling | Agent output becomes a downstream command | Validate/sandbox output at the boundary; authorize at the receiving system |
| LLM10 Unbounded Consumption | Recursive agent/tool loops, denial-of-wallet | Invocation rate limits; cost and loop-depth monitoring with alerts |
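One common way to make an audit trail tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below illustrates that general technique with Python's standard `hashlib` and `json` modules; it is not a description of any particular product's log format, and the entry fields are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with this event's canonical JSON."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log: list[dict], event: dict) -> None:
    """Append an event, chaining it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "event": event, "hash": _digest(prev_hash, event)})


def verify(log: list[dict]) -> bool:
    """Recompute the chain from the start; any edited entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log: list[dict] = []
    append_entry(log, {"agent": "a1", "tool": "read_file"})
    append_entry(log, {"agent": "a1", "tool": "http_post"})
    print(verify(log))
    log[0]["event"]["tool"] = "noop"  # simulate tampering with an earlier record
    print(verify(log))
```

A chain like this detects edits to existing entries, but not truncation or a full rewrite, so the latest hash should be anchored somewhere the agent host cannot modify.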
A workable sequence for a security team adopting the list:
- Inventory first. You cannot apply least privilege or supply-chain review to agents and MCP servers you have not discovered. Establish a continuous inventory of every agent and MCP server across the fleet - this underpins LLM03 and LLM06 directly.
- Map permissions to scope. For each agent and tool, record what it *can* do versus what it *needs* to do. The delta is your Excessive Agency exposure.
- Baseline behavior, then alert on deviation. Static filters miss many injection-driven and poisoning-driven attacks because the payload arrives through trusted data; deviations in agent action patterns can surface them.
- Validate output at every boundary so a hijacked or hallucinating agent cannot turn text into a destructive action.
- Keep an audit trail of tool invocations and data access - both a control and your evidence base for incident response and compliance.
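The "can do versus needs to do" comparison in the sequence above is a set difference per agent. Here is a minimal sketch, assuming permissions can be represented as flat strings; the scope names and the `excess_permissions` helper are hypothetical.

```python
def excess_permissions(
    granted: dict[str, set[str]],
    needed: dict[str, set[str]],
) -> dict[str, set[str]]:
    """For each agent, return the permissions it holds but does not need."""
    excess = {}
    for agent, perms in granted.items():
        # An agent with no recorded needs is treated as needing nothing.
        extra = perms - needed.get(agent, set())
        if extra:
            excess[agent] = extra
    return excess


if __name__ == "__main__":
    granted = {
        "ci-agent": {"repo:read", "repo:write", "secrets:read"},
        "docs-agent": {"wiki:read"},
    }
    needed = {
        "ci-agent": {"repo:read"},
        "docs-agent": {"wiki:read"},
    }
    print(excess_permissions(granted, needed))
```

Agents that appear in `granted` but not in `needed` show up with their entire permission set as excess, which doubles as a prompt to document what they are for.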
For teams standardizing this across many developers, we go deeper in Governing AI Coding Assistants Across Your Fleet and on the discovery problem in What We Find When We Scan AI Agent Configs.
Where continuous agent and MCP visibility fits
Read the operational table again and a pattern emerges: almost every control assumes you already know which agents and MCP servers exist, what permissions they hold, and how they normally behave. That assumption is exactly what most organizations cannot satisfy today. The LLM Top 10 tells you *what* to govern; it is silent on the prerequisite - *seeing* the agent layer in the first place.
That visibility layer is the category Anomity works in: discovering and inventorying every AI agent and MCP server on the endpoint and across the fleet, monitoring their permissions and behavior, alerting on anomalies, and producing the audit trail. It is not a replacement for the OWASP guidance - it is the substrate that makes LLM02, LLM03, LLM06, and LLM10 enforceable rather than aspirational. We describe how that discovery works in Inside Anomity Discovery. The principle is simple and unchanged: you can't govern what you can't see.
Bottom line
The OWASP Top 10 for LLM Applications (2025) is the right shared vocabulary for LLM risk, and Excessive Agency, Prompt Injection, Supply Chain, and Improper Output Handling are the entries that bite hardest once a model can act through tools and MCP servers. Use it as your on-ramp, lean on OWASP's agentic companions for the autonomous-agent specifics, and ground the whole program in a continuous inventory of the agent layer - because every control on the list depends on it.
Frequently asked questions
What is the OWASP Top 10 for LLM Applications?
It is a community-built, vendor-neutral list of the ten most critical security risks in applications built on large language models. It is published by the OWASP GenAI Security Project (an OWASP Foundation project) under a Creative Commons Attribution-ShareAlike 4.0 license, and is the de facto industry reference for securing LLM-backed software.
What changed in the 2025 edition?
The 2025 list (released 18 November 2024) reordered prior risks based on production data and community input, and added two new categories: System Prompt Leakage (LLM07) and Vector and Embedding Weaknesses (LLM08), the latter covering RAG and embedding-store risks. Prompt Injection remains #1; Sensitive Information Disclosure rose to #2.
What are the OWASP Top 10 LLM risks for 2025?
LLM01 Prompt Injection, LLM02 Sensitive Information Disclosure, LLM03 Supply Chain, LLM04 Data and Model Poisoning, LLM05 Improper Output Handling, LLM06 Excessive Agency, LLM07 System Prompt Leakage, LLM08 Vector and Embedding Weaknesses, LLM09 Misinformation, and LLM10 Unbounded Consumption.
Does the OWASP Top 10 for LLM Applications cover AI agents and MCP servers?
Only partially. Excessive Agency (LLM06) is the closest anchor for autonomous tool-use and MCP risk, and several other entries (prompt injection, supply chain, improper output handling) apply directly. But the list is application-centric. OWASP treats deeper autonomous-agent risk in companion documents: 'Agentic AI – Threats and Mitigations' and the 'OWASP Top 10 for Agentic Applications' (released December 2025).
What is Excessive Agency in the OWASP LLM Top 10?
Excessive Agency (LLM06) is when an LLM-based system is granted more functionality, permissions, or autonomy than necessary, letting it take harmful actions through tools, extensions, or plugins in response to ambiguous, hallucinated, or manipulated output. OWASP's mitigations include least privilege, human-in-the-loop for high-impact actions, downstream authorization, and logging and rate-limiting tool invocations.
How is the OWASP LLM Top 10 different from the OWASP Top 10 for Agentic Applications?
The LLM Top 10 ranks risks in applications built on LLMs - prompts, outputs, RAG, supply chain. The OWASP Top 10 for Agentic Applications (released December 2025), along with the 'Agentic AI – Threats and Mitigations' guide, focuses on autonomous-agent-specific threats like shadow agents, agent identity, multi-agent trust, and tool misuse. They are complementary references.
Is the OWASP Top 10 for LLM Applications a compliance standard?
No. It is a prioritized risk reference, not a certifiable standard like ISO 27001 or SOC 2. Security teams use it to structure threat models, design tests, and shape control requirements, and it is increasingly referenced inside broader AI governance and assurance programs.
How do you operationalize the OWASP LLM Top 10 for an agent fleet?
Map each entry to a concrete control and signal: inventory every agent and MCP server (Supply Chain, Excessive Agency), monitor per-tool permissions and enforce least privilege, watch for behavioral anomalies that injection and poisoning produce, validate and sandbox agent output before it drives downstream actions, and keep a tamper-evident audit trail for every tool invocation.




