
NIST AI RMF for AI Agents: Govern, Map, Measure, and Manage (2026)

Anomity Research · Jun 21, 2026 · 8 min read

TL;DR

The NIST AI RMF 1.0 (NIST AI 100-1, January 2023) is a voluntary, outcome-based framework built around four functions - GOVERN, MAP, MEASURE, MANAGE - broken into 19 categories and 72 subcategories.
It is descriptive of *outcomes*, not prescriptive of controls: there are no numbered controls, scores, or pass/fail tests. The AI RMF Playbook offers suggested actions per subcategory.
The Generative AI Profile (NIST AI 600-1, July 2024) adds 12 GenAI risk areas and 200+ suggested actions - the closest official extension to agent-relevant risk today.
There is no published NIST 'Agentic Profile'; the Cloud Security Alliance released a draft community extension in late 2025 to address autonomous-agent gaps.
For AI agents and MCP servers, the binding constraint sits under GOVERN: none of the four functions can be carried out for agents you have not first discovered and inventoried.
Continuous discovery, permission monitoring, behavioral anomaly detection, and a queryable audit trail are what turn AI RMF outcomes into evidence you can show a CISO or auditor.

The NIST AI Risk Management Framework (AI RMF 1.0), published as NIST AI 100-1 on January 26, 2023, is the document many U.S. enterprises now reach for when a board or auditor asks how they govern AI. It was directed by the National AI Initiative Act of 2020 and developed by NIST through an open, multi-stakeholder process. It is voluntary, sector-agnostic, and use-case-agnostic, and its stated purpose is to help organizations manage risk to individuals, organizations, and society across the full AI lifecycle while promoting *trustworthy AI*.

The framework was written for the AI most teams had in 2023: models that classify, predict, and generate. It was not written for software that acts - autonomous agents that call tools, chain steps, delegate to other agents, and reach into your file systems and APIs through MCP servers. The good news is that the AI RMF's structure holds up well for agents; the hard part is that agents break the one assumption the framework quietly depends on - that you know which AI systems you have. This guide walks the framework as it actually is, then maps it to the agentic reality without inventing precision the standard does not provide.

What the NIST AI RMF actually is

The AI RMF is outcome-based and descriptive, not prescriptive. This is the single most important thing to understand before you try to use it. Unlike a control catalog such as NIST SP 800-53, it does not give you numbered controls, maturity scores, or pass/fail tests. It describes outcomes you should be able to demonstrate and leaves the *how* to you. A companion document, the AI RMF Playbook, offers suggested actions, documentation, and references for each outcome - but those are suggestions, not requirements.

The framework comes in two parts. Part 1 frames AI risk and defines the seven characteristics of trustworthy AI. Part 2 is the Core - the four functions, categories, and subcategories that organize the work. The whole thing is explicitly iterative, not a linear checklist you complete once.

The seven trustworthy-AI characteristics

Part 1 names seven properties that, together, define trustworthy AI. They are not abstract: the MEASURE function maps subcategories directly onto them, so they become the things you actually evaluate.

Valid and reliable - the foundational characteristic the others build on
Safe - does not, under defined conditions, endanger life, health, property, or environment
Secure and resilient - withstands and recovers from adversarial events
Accountable and transparent - actions are attributable and visible
Explainable and interpretable - outputs and mechanisms can be understood
Privacy-enhanced - respects norms and values around autonomy and data
Fair, with harmful bias managed - equity considerations are addressed

The Core: Govern, Map, Measure, Manage

The Core organizes work into four functions → 19 categories → 72 subcategories. GOVERN is cross-cutting and is meant to inform the other three; MAP, MEASURE, and MANAGE are roughly lifecycle-ordered - establish context, then measure, then respond - but you cycle through them continuously rather than once.

Function	Categories / Subcategories	What it asks you to achieve
GOVERN	6 / 19	A culture of risk management: policies and processes (GOVERN 1), accountability and roles (GOVERN 2), human oversight and diversity (GOVERN 3), a safety-first documenting culture (GOVERN 4), engagement with affected actors (GOVERN 5), and third-party / supply-chain risk for acquired models and components (GOVERN 6).
MAP	5 / 18	Establish context and identify risk: intended purpose and stakeholders (MAP 1), categorize the system and its tasks (MAP 2), capabilities and expected costs/benefits (MAP 3), risks of all components including third-party and IP (MAP 4), and characterize impacts on people and society (MAP 5).
MEASURE	4 / 22	Analyze, benchmark, and monitor risk: select methods and metrics (MEASURE 1), evaluate the trustworthy characteristics including red-teaming/TEVV (MEASURE 2), track emergent and unanticipated risk over time (MEASURE 3), and gather feedback on measurement efficacy (MEASURE 4).
MANAGE	4 / 13	Prioritize and act: respond to assessed risks - mitigate, transfer, avoid, accept (MANAGE 1); maximize benefit including override/deactivation (MANAGE 2); manage third-party and pre-trained model risk (MANAGE 3); document, monitor, and run incident response post-deployment (MANAGE 4).
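As a quick check on the arithmetic, the per-function counts in the table can be summed to confirm the headline totals. This is only a sketch; the dictionary name `CORE` is an illustrative label, not something the framework defines.

```python
# (categories, subcategories) per function, copied from the table above.
CORE = {
    "GOVERN": (6, 19),
    "MAP": (5, 18),
    "MEASURE": (4, 22),
    "MANAGE": (4, 13),
}

total_categories = sum(cats for cats, _ in CORE.values())
total_subcategories = sum(subs for _, subs in CORE.values())

print(total_categories, total_subcategories)  # 19 72
```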

Two profiles build on the Core, only one of which is a NIST publication. The Generative AI Profile (NIST AI 600-1), published July 26, 2024 under Executive Order 14110, identifies 12 GenAI risk areas - including confabulation, data privacy, information security, information integrity, value chain and component integration, harmful bias, and human-AI configuration - and maps 200+ suggested actions to the four functions. It is the closest official extension to agent-relevant risk today. A formal NIST 'Agentic Profile' does not yet exist; the Cloud Security Alliance published a draft community 'Agentic AI Profile' in late 2025 to fill the autonomy gap, but it is not a NIST standard.

Why agents and MCP servers stress the framework

The AI RMF assumes a known system that you are deploying deliberately. Agentic AI violates that assumption in three ways. First, agents arrive bottom-up: a developer installs a coding agent, an analyst wires up an MCP server, and neither shows up in any inventory. Second, agents act - they hold credentials, call tools, and chain multi-step operations whose blast radius is hard to predict. Third, agents delegate, diffusing accountability across human and machine identities. We have written about this dynamic at length in Why AI Agents and MCP Servers Are the New Shadow IT and in MCP Server Security: The Complete Guide.

None of this requires a new framework. It requires reading the existing one honestly: every function below has an agentic interpretation, and most of them quietly assume a capability - visibility into the agent fleet - that few organizations actually have.

Mapping the functions to agentic risk

AI RMF item	Agentic / MCP risk	What you have to monitor
GOVERN 1, 2, 6	Shadow agents and unvetted MCP servers operating with no policy, owner, or record	A live inventory of every agent and MCP server on the fleet, including third-party/acquired ones
GOVERN 2 + delegation	Authority delegated across multi-agent and tool chains, diffusing accountability	Which human or agent identity authorized which action - a delegation/accountability record
MAP 2, 3, 4	Excessive agency: an agent's tool access and MCP permissions exceed its task	Each agent's tools, MCP permissions, and blast radius (consequence scope, reversibility)
MAP 5 + GenAI data privacy/info-security	Sensitive-data exposure as agents read files, databases, and APIs	Which data sources each agent and MCP server can actually reach
MEASURE 2, 3	Behavioral drift, permission escalation, and emergent risk over time	Runtime telemetry: action velocity, permission changes, delegation depth, anomalies
MEASURE + GenAI info-integrity/security	Prompt injection arriving via tool outputs, not the user prompt	Tool inputs and outputs across the MCP boundary
MANAGE 1, 2, 4	A compromised or runaway agent that needs to be stopped	An override/deactivation path, drift correction, and principled decommissioning
MANAGE / GOVERN (all)	No queryable record to prove any of the above to an auditor	A continuous audit trail of every agent, MCP server, permission, and anomalous action
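The delegation/accountability record in the table can be pictured as a chain that walks from the acting identity back to the human who originally granted authority. The sketch below assumes a simple mapping from each identity to whoever delegated to it; the identities and the `delegated_by` structure are hypothetical, not a format the AI RMF prescribes.

```python
# Each identity maps to whoever delegated authority to it (None marks the root).
# All identities here are made up for illustration.
delegated_by = {
    "alice@example.com": None,
    "planner-agent": "alice@example.com",
    "search-agent": "planner-agent",
}


def delegation_chain(identity, delegated_by):
    """Walk from an acting identity back to the root that authorized it."""
    chain = [identity]
    seen = {identity}
    while (parent := delegated_by.get(chain[-1])) is not None:
        if parent in seen:
            # A loop means the record cannot attribute the action to anyone.
            raise ValueError(f"delegation cycle at {parent}")
        chain.append(parent)
        seen.add(parent)
    return chain


chain = delegation_chain("search-agent", delegated_by)
print(chain)           # ['search-agent', 'planner-agent', 'alice@example.com']
print(len(chain) - 1)  # delegation depth: 2
```

The chain length doubles as the delegation-depth signal listed under MEASURE, so one record can serve both accountability and monitoring.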

Operationalizing it for a security team

Treat the four functions as a continuous loop you run against the fleet, not a binder you fill out once. Here is a pragmatic sequence that respects the framework's intent while staying grounded in what agents actually do.

1. GOVERN - establish ownership and discover the fleet

GOVERN 1, 2, and 6 are unsatisfiable on paper alone. Before you can write a policy for AI agents, name an owner, or assess a third-party MCP server, you have to know they exist. Start with discovery: enumerate every agent runtime, CLI, IDE extension, and MCP server on endpoints and in CI. Our walkthrough of what surfaces when you scan AI configs shows how much of this is invisible by default. Assign each discovered agent an owner and a record - that record is the spine everything else hangs from.
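A minimal sketch of the discover-then-assign step, assuming agent configs are JSON files with a top-level `mcpServers` object keyed by server name (a layout several MCP clients use, though yours may differ). The `InventoryRecord` type, the sample config, and the file path are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class InventoryRecord:
    name: str
    kind: str                    # "mcp_server" here; agents and CLIs would use other kinds
    source: str                  # where the entry was discovered
    command: str                 # how the server is launched
    owner: Optional[str] = None  # unassigned until someone claims it


def discover_mcp_servers(config_text: str, source: str) -> list[InventoryRecord]:
    """Turn one config file's contents into inventory records.

    Assumes JSON with a top-level "mcpServers" object; other layouts
    would need their own parser.
    """
    config = json.loads(config_text)
    records = []
    for name, spec in config.get("mcpServers", {}).items():
        command = " ".join([spec.get("command", ""), *spec.get("args", [])]).strip()
        records.append(InventoryRecord(name, "mcp_server", source, command))
    return records


# Hypothetical config content, inlined so the sketch runs without touching disk.
sample = '{"mcpServers": {"files": {"command": "npx", "args": ["example-fs-server", "/home/dev"]}}}'
inventory = discover_mcp_servers(sample, source="~/.example-agent/config.json")

# Anything without an owner is a governance gap to close first.
unowned = [r.name for r in inventory if r.owner is None]
print(unowned)  # ['files']
```

In practice the same function would run over every config path found on endpoints and in CI, with the unowned list feeding the ownership assignment.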

2. MAP - classify tools, permissions, and blast radius

For each agent, satisfy MAP 2–4 by classifying its tasks, its tool and MCP permissions, and the data it can touch. The practical question is excessive agency: does this agent hold more authority than its job requires, and how reversible are its actions? Coding agents and CLIs are the sharpest case because they execute code and hold tokens - see Securing AI Coding Agents and CLIs. Third-party MCP servers belong here too, under MAP 4 and GOVERN 6, because an acquired tool provider can carry supply-chain risk you inherit; AI Supply-Chain Attacks: A Defender's Guide covers that threat model.
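The excessive-agency question reduces to a set difference: what the agent holds minus what its task needs, with the hard-to-undo subset called out separately. The sketch below assumes permissions can be expressed as flat string labels; the labels, the `IRREVERSIBLE` set, and the `AgentProfile` type are illustrative assumptions.

```python
from dataclasses import dataclass

# Hypothetical permission labels treated as hard to reverse.
IRREVERSIBLE = {"fs:delete", "db:write", "shell:exec"}


@dataclass(frozen=True)
class AgentProfile:
    name: str
    granted: frozenset   # permissions the agent actually holds
    required: frozenset  # permissions its documented task needs


def excess_permissions(agent: AgentProfile) -> set:
    """Permissions held beyond what the agent's task requires."""
    return set(agent.granted - agent.required)


def irreversible_excess(agent: AgentProfile) -> set:
    """Subset of the excess whose effects are hard to undo."""
    return excess_permissions(agent) & IRREVERSIBLE


agent = AgentProfile(
    name="docs-summarizer",
    granted=frozenset({"fs:read", "fs:delete", "http:get"}),
    required=frozenset({"fs:read"}),
)
print(sorted(excess_permissions(agent)))   # ['fs:delete', 'http:get']
print(sorted(irreversible_excess(agent)))  # ['fs:delete']
```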

3. MEASURE - monitor behavior and emergent risk

MEASURE 2 and 3 ask you to evaluate trustworthiness and track *unanticipated and emergent* risk over time. For agents this means runtime behavioral telemetry - action velocity, permission escalation, delegation depth - and watching tool inputs and outputs for the agentic flavor of prompt injection, where adversarial instructions arrive through a tool result rather than the user prompt. That class of attack is documented in our coverage of MCP tool poisoning. MEASURE is where a static inventory becomes a living one.
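One way to picture the three runtime signals named above is a comparison of the current observation window against a baseline. This is a sketch under stated assumptions: the `Window` type, the 3x velocity factor, and the depth limit of 2 are illustrative choices, not thresholds from NIST, and window lengths are assumed to be positive.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Telemetry for one agent over one observation window."""
    actions: int            # tool calls made in the window
    minutes: float          # window length, assumed > 0
    delegation_depth: int   # longest agent-to-agent chain observed
    permissions: frozenset  # permissions held at the end of the window


def flags(baseline: Window, current: Window,
          velocity_factor: float = 3.0, max_depth: int = 2) -> list:
    """Return the names of signals that exceed their assumed thresholds."""
    raised = []
    base_rate = baseline.actions / baseline.minutes
    rate = current.actions / current.minutes
    if rate > velocity_factor * base_rate:
        raised.append("action_velocity")
    if current.permissions - baseline.permissions:
        # Any permission not present in the baseline counts as escalation.
        raised.append("permission_escalation")
    if current.delegation_depth > max_depth:
        raised.append("delegation_depth")
    return raised


baseline = Window(actions=30, minutes=60, delegation_depth=1,
                  permissions=frozenset({"fs:read"}))
current = Window(actions=120, minutes=30, delegation_depth=1,
                 permissions=frozenset({"fs:read", "shell:exec"}))
print(flags(baseline, current))  # ['action_velocity', 'permission_escalation']
```

Fixed thresholds are the simplest possible choice; a real deployment would likely tune them per agent type or replace them with a statistical baseline.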

4. MANAGE - respond, contain, and decommission

MANAGE 1, 2, and 4 turn measurements into action: prioritize risks, keep an override or deactivation path (MANAGE 2's kill-switch outcome), correct behavioral drift, and decommission agents cleanly. For a runaway or compromised agent - the scenario in multi-agent prompt injection and credential theft - you need to know which identity is affected, what it can reach, and how to revoke it fast. None of that works without the visibility you built in GOVERN and MEASURE.
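The three questions in that paragraph (which identity, what it can reach, how to revoke it) can be answered mechanically once an inventory record exists. The sketch below assumes each record lists an owner, credentials, and reachable data sources; the record shape, the step labels, and the example agent are hypothetical.

```python
def containment_plan(agent_id: str, inventory: dict) -> list:
    """Build ordered containment steps for one agent from its inventory record."""
    record = inventory[agent_id]
    steps = [f"disable:{agent_id}"]  # stop the agent before anything else
    steps += [f"revoke:{cred}" for cred in sorted(record["credentials"])]
    steps += [f"review-access:{src}" for src in sorted(record["data_sources"])]
    steps.append(f"notify-owner:{record['owner']}")
    return steps


# Hypothetical inventory entry for illustration.
inventory = {
    "coding-agent-7": {
        "owner": "platform-team",
        "credentials": {"github-token", "aws-session"},
        "data_sources": {"repo:payments"},
    }
}

for step in containment_plan("coding-agent-7", inventory):
    print(step)
```

An agent missing from the inventory raises `KeyError` here, which mirrors the point above: without the record built during discovery there is nothing to act on.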

Where continuous agent and MCP visibility fits

Read across the four functions and a pattern emerges. GOVERN needs an inventory. MAP needs permission and data-reach classification. MEASURE needs runtime behavioral telemetry. MANAGE needs an identity-aware response path and a record to act on. And across all of them, NIST asks for documentation, monitoring, and incident tracking that you can produce on demand. For traditional software you get most of that from existing asset, IAM, and SIEM tooling. For autonomous agents and MCP servers, those tools largely do not see the layer at all.
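"Produce on demand" implies the record is queryable, not just logged. As a minimal sketch, an append-only table in SQLite (via Python's standard `sqlite3` module) is enough to show the idea; the schema, event names, and agents below are illustrative assumptions.

```python
import sqlite3

# In-memory database so the sketch is self-contained; a real trail would persist.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audit ("
    "ts TEXT NOT NULL, agent TEXT NOT NULL, event TEXT NOT NULL, detail TEXT NOT NULL)"
)


def record(ts: str, agent: str, event: str, detail: str) -> None:
    """Append one event; rows are never updated or deleted in this sketch."""
    conn.execute("INSERT INTO audit VALUES (?, ?, ?, ?)", (ts, agent, event, detail))


record("2026-06-01T09:00:00Z", "coding-agent-7", "discovered", "found in IDE extension config")
record("2026-06-02T11:15:00Z", "report-agent-2", "discovered", "found in CI job")
record("2026-06-03T14:30:00Z", "coding-agent-7", "permission_added", "shell:exec")

# The auditor's question: what happened to this agent, in order?
rows = conn.execute(
    "SELECT ts, event FROM audit WHERE agent = ? ORDER BY ts",
    ("coding-agent-7",),
).fetchall()
for ts, event in rows:
    print(ts, event)
```

ISO 8601 timestamps in UTC sort correctly as text, which is why `ORDER BY ts` works without a date type.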

That gap - continuous discovery, permission monitoring, behavioral anomaly detection, and a queryable audit trail for every agent and MCP server - is the category Anomity works in, and the reason we describe our discovery approach in Inside Anomity Discovery. The framing matters more than any product: AI RMF outcomes can be backed with evidence only when the agent layer is visible. If you are governing coding assistants specifically, Governing AI Coding Assistants Across Your Fleet maps these same functions to that narrower problem.

Common mistakes

Treating it as a control checklist. There are no numbered controls or scores in the AI RMF - do not invent them or cite control numbers that do not exist.
Skipping GOVERN discovery. Writing AI policy before you have an agent inventory produces a document that describes systems you cannot see.
Ignoring third-party MCP servers. GOVERN 6 and MAP 4 explicitly cover acquired components; unvetted MCP servers are exactly the supply-chain risk they target.
Running it once. The framework is iterative. Agents change behavior, permissions, and tooling continuously, so MEASURE has to be continuous too.
Confusing the GenAI Profile with an agent profile. NIST AI 600-1 helps, but no official NIST agentic profile exists yet; do not cite one as a NIST standard.

Bottom line

The NIST AI RMF is a strong, durable scaffold - voluntary, outcome-based, and broad enough to absorb agentic AI without amendment. Its four functions map cleanly onto the work of governing an agent fleet: discover and own (GOVERN), classify permissions and blast radius (MAP), monitor behavior and drift (MEASURE), and respond and decommission (MANAGE). The framework's only blind spot is the one it inherited from a pre-agentic world: it assumes you can already see your AI. For autonomous agents and MCP servers, earning that visibility is the first and load-bearing step. You can't govern what you can't see - and with agents, you usually can't see it yet.

Frequently asked questions

Is the NIST AI RMF mandatory?

No. The AI RMF 1.0 is a voluntary, rights-preserving framework, not law. But it is referenced across U.S. federal AI policy and crosswalks to ISO/IEC 42001, ISO/IEC 23894, and the NIST SP 800-53 ecosystem, so most enterprises adopt it as a governance baseline and as audit evidence.

What are the four functions of the NIST AI RMF?

GOVERN, MAP, MEASURE, and MANAGE. GOVERN is cross-cutting (culture, accountability, policy, third-party risk). MAP establishes context and identifies risk, MEASURE analyzes and monitors it, and MANAGE prioritizes and responds. Together they hold 19 categories and 72 subcategories.

Does the NIST AI RMF define specific controls or scores?

No. The AI RMF is outcome-based and descriptive - there are no numbered controls, maturity scores, or pass/fail criteria. The companion AI RMF Playbook offers suggested actions, documentation, and references for each subcategory, but you choose how to achieve each outcome.

Is there a NIST AI RMF profile for AI agents?

Not as a published NIST standard. The closest official extension is the Generative AI Profile (NIST AI 600-1, July 2024). The Cloud Security Alliance published a draft community 'Agentic AI Profile' in late 2025 to address gaps around autonomy, delegation, and tool use, but it is not a NIST document.

How does the NIST AI RMF apply to MCP servers?

MCP servers are frequently third-party components, so they fall squarely under GOVERN 6 (supply-chain and acquired-component risk) and MAP 4 (component risk). Satisfying those outcomes requires an inventory of every MCP server, the permissions it holds, and the data sources it can reach - which most organizations do not have today.

What is the Generative AI Profile (NIST AI 600-1)?

It is a NIST profile published in July 2024 that tailors the AI RMF to generative AI. It identifies 12 GenAI risk areas - such as confabulation, data privacy, information security, information integrity, and value-chain/component integration - and maps 200+ suggested actions to the four functions.

How do you operationalize the NIST AI RMF for an agent fleet?

Start with discovery and inventory to satisfy GOVERN, classify each agent's tools and permissions to satisfy MAP, monitor runtime behavior and permission drift to satisfy MEASURE, and wire alerts into incident response with deactivation and decommissioning paths to satisfy MANAGE. Keep a continuous audit trail as the evidence layer across all four.

How does the NIST AI RMF relate to ISO/IEC 42001?

They are complementary. ISO/IEC 42001 is a certifiable AI management-system standard; the NIST AI RMF is a voluntary risk framework. NIST publishes a crosswalk between the AI RMF and ISO/IEC 42001 (and references ISO/IEC 23894 and SP 800-53), so outcomes you document for one can usually be reused as evidence for the others.