GDPR for AI Agents: Lawful Basis, Consent, and Erasure When an Agent Processes Personal Data

Anomity Research · Jun 21, 2026 · 11 min read

TL;DR

GDPR applies fully to AI agents that touch personal data, but consent, purpose limitation, and erasure all break in new ways when an agent acts autonomously and composes tools at runtime.
Consent is fragile because an agent's scope drifts mid-task; the operational fix is purpose locks and goal-change gates that re-verify lawful basis before the agent expands its scope.
Erasure (Art 17) is near-infeasible once personal data is baked into model weights or embeddings, so keep PII in deletable, addressable tiers (DBs, caches, vector stores) with deletion APIs, not in model parameters.
Article 30 requires Records of Processing Activities, which for agents means in practice a per-agent inventory of what personal data is processed, why, where it flows, and how long it is kept; you cannot answer a DSAR for agents you do not know exist.
MCP servers can become sub-processors (Art 28) and cross-border tool calls trigger Chapter V transfer rules; controller and processor roles must be resolved at runtime, not assumed at design time.
An agent with private data, untrusted input, and external comms (the lethal trifecta) is a standing GDPR breach risk under Art 33/34, so defend structurally, not just by detection.

A click-through consent box assumes you know, in advance, exactly what will happen to someone's data. An AI agent breaks that assumption on its first turn. It decides at runtime which tools to call, which records to read, which external systems to reach, and how far to expand its own objective to get the job done. By the time the work is finished, the agent has often processed personal data in ways no consent screen described, and no one logged that it happened.

That gap is where GDPR for AI agents lives. The regulation has not changed. What has changed is that the entity doing the processing now improvises. Consent, purpose limitation, retention, and the right to erasure were all written for systems whose behavior is fixed at design time. Agents decide at execution time. This guide walks through how each core GDPR obligation actually applies to an autonomous agent and the MCP servers it calls, and the concrete controls a security and privacy team can put in place.

An AI agent is a means of processing, not a legal person. The organization that deploys it is the controller and stays fully accountable under Article 5(2). Autonomy does not soften any obligation; it makes obligations harder to evidence because the decisions that matter, including scope, retention, recipients, and transfers, are made while the agent runs, not when an engineer writes the code.

The articles most relevant to agentic processing are familiar, but they bite differently:

Article	Obligation	Why it strains under agents
5(1)(b)	Purpose limitation	Agents expand their own objective mid-task, drifting beyond the declared purpose
5(1)(e)	Storage limitation	Conversation logs, working memory, and vector embeddings accumulate indefinitely
6 / 9	Lawful basis / special categories	Consent is fragile; agents infer health, ethnicity, and other Art 9 data unprompted
15	Right of access	DSARs require a trace of what the agent did with a person's data
17	Right to erasure	Personal data baked into model weights or embeddings is hard to delete
22	Automated decisions	Agents act without meaningful human oversight by design
28 / Ch V	Processors / transfers	MCP servers become sub-processors; cross-border tool calls are transfers
30	Records of processing	You must inventory every agent's data flows to be compliant at all

Every act of processing needs an Article 6 basis. For agents, the two realistic candidates are consent and legitimate interest, and both carry agent-specific traps.

Consent under GDPR must be specific and informed. A user can consent to an agent "drafting a reply," but the agent, pursuing that goal, may read the full thread, query a CRM, call a summarization API, and store the result in long-term memory. Each step is processing the user never specifically agreed to. The consent was valid at the start of the task and stale three tool calls later.

The operational fix is to treat the agent's objective as an inspectable object and wrap it in *purpose locks and goal-change gates*. When the agent tries to expand its scope, the system pauses and re-verifies that the new scope is still covered by the original lawful basis, or stops and obtains fresh consent before continuing. This is the agentic analog of Article 5(1)(b) purpose limitation: the purpose is enforced at runtime, not just declared in a policy.
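
A purpose lock with a goal-change gate can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the class names (`PurposeLock`, `GoalChangeGate`), the action labels, and the `basis_reverified` flag are all hypothetical, and the actual re-verification (a legal check or a fresh consent prompt) happens outside the code.

```python
from dataclasses import dataclass, field, replace


class ScopeExpansionBlocked(Exception):
    """Raised when an action falls outside the locked purpose."""


@dataclass(frozen=True)
class PurposeLock:
    purpose: str                # the declared purpose, e.g. "draft a reply"
    lawful_basis: str           # e.g. "consent" or "legitimate_interest"
    allowed_actions: frozenset  # processing steps covered by that basis


@dataclass
class GoalChangeGate:
    lock: PurposeLock
    audit: list = field(default_factory=list)

    def check(self, action: str) -> None:
        """Allow an action only if the current lock already covers it."""
        if action not in self.lock.allowed_actions:
            self.audit.append(("blocked", action))
            raise ScopeExpansionBlocked(
                f"{action!r} is outside purpose {self.lock.purpose!r}"
            )
        self.audit.append(("allowed", action))

    def extend(self, action: str, basis_reverified: bool) -> None:
        """Widen the lock only after the lawful basis has been re-verified."""
        if not basis_reverified:
            raise ScopeExpansionBlocked(f"no re-verified basis for {action!r}")
        widened = self.lock.allowed_actions | {action}
        self.lock = replace(self.lock, allowed_actions=widened)
        self.audit.append(("extended", action))


lock = PurposeLock("draft a reply", "consent", frozenset({"read_thread", "draft_text"}))
gate = GoalChangeGate(lock)

gate.check("read_thread")              # covered by the original consent
try:
    gate.check("query_crm")            # scope expansion: the gate pauses the agent
except ScopeExpansionBlocked:
    # Assumption: a human or policy engine confirmed the basis out of band.
    gate.extend("query_crm", basis_reverified=True)
gate.check("query_crm")                # now covered, and the change is on record
```

The audit list is the point of the design: every blocked or extended action leaves a record of when the purpose changed and that the basis was re-checked.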

Legitimate interest needs the three-step test

Where consent is impractical, legitimate interest under Article 6(1)(f) is often the more honest basis. The EDPB's Opinion 28/2024 (adopted 17 December 2024) is the most recent authoritative reference here. It reaffirms the three-step test you must document for the agent's processing, and stresses that legitimate interest cannot be a default basis:

Interest: identify a real, specific legitimate interest the processing serves.
Necessity: show the processing is actually needed to achieve it, with no less-intrusive alternative.
Balancing: weigh that interest against the rights and reasonable expectations of the data subject.

For agents, the balancing step is the hard one, because a data subject's reasonable expectation rarely includes an autonomous system reaching across systems and inferring new attributes about them.
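
One way to keep the three-step test from being skipped is to store it as a structured record per agent and refuse to treat it as complete while any step is blank. The sketch below assumes a hypothetical `LegitimateInterestAssessment` record; the field names are illustrative, and the code only checks that each step was written down, not that the reasoning is sound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LegitimateInterestAssessment:
    agent_id: str
    interest: str   # step 1: the specific interest pursued
    necessity: str  # step 2: why no less-intrusive alternative works
    balancing: str  # step 3: weighing against data-subject expectations

    def missing_steps(self) -> list:
        """Return the names of any steps left undocumented."""
        steps = {
            "interest": self.interest,
            "necessity": self.necessity,
            "balancing": self.balancing,
        }
        return [name for name, text in steps.items() if not text.strip()]


lia = LegitimateInterestAssessment(
    agent_id="support-triage",  # hypothetical agent name
    interest="Route incoming tickets to the right queue",
    necessity="Routing needs the ticket text; no attachments are read",
    balancing="",               # not yet documented
)
print(lia.missing_steps())
```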

Article 9 fires by inference

This is the trap teams miss most often. An agent does not need a field labeled "health" to process special-category data. If it auto-tags a discharge note as "endocrinology," classifies a message as relating to a trade union, or infers ethnicity from a name and location, it has produced an Article 9 inference, and almost no agent pipeline fires a special-category consent workflow when that happens. Treat any agent that classifies, tags, or enriches personal data as a potential Article 9 processor and gate it accordingly.
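
A gate on enrichment output might look like the sketch below. The keyword table is a deliberately tiny, hypothetical stand-in for whatever classifier or taxonomy a real pipeline uses, and `art9_condition_documented` is an assumed flag meaning that someone has recorded an Article 9(2) condition for this agent.

```python
# Illustrative hints only; a real deployment needs a maintained taxonomy.
SPECIAL_CATEGORY_HINTS = {
    "health": {"endocrinology", "oncology", "diagnosis"},
    "trade_union": {"trade union", "union membership"},
    "ethnic_origin": {"ethnicity", "ethnic origin"},
}


def special_categories(tags):
    """Return the Article 9 categories that a set of agent-produced tags touches."""
    found = set()
    for tag in tags:
        normalized = tag.strip().lower()
        for category, hints in SPECIAL_CATEGORY_HINTS.items():
            if normalized in hints:
                found.add(category)
    return found


def gate_tags(tags, art9_condition_documented):
    """Hold tags back when they imply special-category data without a recorded condition."""
    categories = special_categories(tags)
    released = not categories or art9_condition_documented
    return {"released": released, "categories": sorted(categories)}


result = gate_tags(["Endocrinology", "follow-up"], art9_condition_documented=False)
print(result)
```

The gate sits on the agent's output rather than its input, because the special-category data here is something the agent created, not something it was given.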

Article 22 is the agent article

Article 22 restricts solely automated decisions with legal or similarly significant effects. Autonomy is the whole point of an agent, which is exactly what Article 22 constrains for high-impact decisions such as approvals, denials, pricing, and escalation. If an agent makes such a decision without meaningful human review, you need one of the Article 22(2) exceptions: necessity for a contract, authorization by Union or Member State law, or explicit consent. Under the contract and consent exceptions you must also provide safeguards, including at least the right to obtain human intervention, to express a point of view, and to contest the decision. Mapping which agents make which decisions is part of governing them at all, a theme we develop in governing AI coding assistants across your fleet.
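
In code, the simplest form of this control is a gate that refuses to finalize a high-impact decision until a reviewer is attached. The decision kinds below come from the examples in this section; the `Decision` type and `finalize` function are hypothetical, and a reviewer field is only a marker, since meaningful review depends on the reviewer having real authority and information.

```python
from dataclasses import dataclass
from typing import Optional

# Decision kinds treated as legally or similarly significant (illustrative list).
SIGNIFICANT_DECISIONS = {"approve", "deny", "price", "escalate"}


@dataclass
class Decision:
    kind: str
    subject_id: str
    proposed_outcome: str
    reviewer: Optional[str] = None  # set once a human has reviewed the proposal


def finalize(decision: Decision) -> str:
    """Hold significant decisions until a human reviewer is recorded."""
    if decision.kind in SIGNIFICANT_DECISIONS and decision.reviewer is None:
        return "pending_human_review"
    return "final"


print(finalize(Decision("deny", "subj-1", "reject application")))
print(finalize(Decision("deny", "subj-1", "reject application", reviewer="j.doe")))
```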

Agents generate a lot of durable personal data: conversation transcripts, tool-call traces, working memory, and vector embeddings of documents. Left alone, all of it persists indefinitely, a direct clash with storage limitation under Article 5(1)(e).

There is also a genuine cross-regulatory tension. The EU AI Act pushes operators of certain systems to log behavior for traceability, while GDPR pushes you to minimize and delete. We keep the AI Act itself out of scope here and cover it in the EU AI Act guide for AI agents; the relevant point for GDPR is only the logging-versus-erasure conflict.

Reconcile by depersonalizing the logs

The pattern that resolves the tension is to log the agent's behavior, including inputs, outputs, tool calls, and confidence scores, without logging who triggered each interaction. Depersonalized behavior logs remain useful for security, debugging, and compliance, and they survive an erasure request because they no longer identify a person. You get traceability and storage limitation at once.
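
A minimal sketch of that pattern follows. The field names in `IDENTITY_FIELDS` are assumptions about what an event might carry, and the function only strips who triggered the interaction: free-text inputs and outputs can still contain personal data and would need their own redaction step, which is not shown here.

```python
import json
from datetime import datetime, timezone

# Assumed names of fields that identify who triggered the interaction.
IDENTITY_FIELDS = {"user_id", "email", "ip_address", "session_id"}


def behavior_log_entry(event: dict) -> str:
    """Serialize what the agent did, dropping fields that say who asked."""
    entry = {key: value for key, value in event.items() if key not in IDENTITY_FIELDS}
    entry["logged_at"] = datetime.now(timezone.utc).isoformat()
    return json.dumps(entry, sort_keys=True)


line = behavior_log_entry({
    "user_id": "u-1842",
    "email": "someone@example.com",
    "tool": "crm.lookup",       # hypothetical tool name
    "input_tokens": 412,
    "confidence": 0.87,
})
print(line)
```

Note that the identity fields are dropped rather than hashed. A hashed identifier is still pseudonymous personal data, so a log that keeps one would not survive an erasure request in the way described above.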

Tiered memory governance

Treat agent memory as distinct tiers, each with its own retention rule and deletion path:

Ephemeral working memory scoped to a single task, discarded on completion.
Long-lived profiles for user preferences and context, with an explicit TTL and a deletion API.
Vector embeddings, semantic indexes of personal data, which need their own deletion-by-subject capability, not just document-level deletes.
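
The three tiers can be modeled as one small object with a different lifecycle rule per tier. This is an in-memory sketch under stated assumptions: `TieredMemory` and its method names are hypothetical, and a real vector store would need subject identifiers stored as metadata alongside each embedding for `delete_subject` to be possible at all.

```python
import time
from dataclasses import dataclass, field


@dataclass
class TieredMemory:
    profile_ttl_seconds: float
    working: dict = field(default_factory=dict)     # task_id -> scratch data
    profiles: dict = field(default_factory=dict)    # subject_id -> (expires_at, data)
    embeddings: list = field(default_factory=list)  # (subject_id, vector) pairs

    def end_task(self, task_id: str) -> None:
        """Ephemeral tier: discard working memory when the task completes."""
        self.working.pop(task_id, None)

    def put_profile(self, subject_id: str, data: dict, now: float = None) -> None:
        """Long-lived tier: every profile is stored with an explicit expiry."""
        now = time.time() if now is None else now
        self.profiles[subject_id] = (now + self.profile_ttl_seconds, data)

    def sweep(self, now: float = None) -> list:
        """Delete profiles whose TTL has passed; return the expired subject ids."""
        now = time.time() if now is None else now
        expired = [s for s, (expires_at, _) in self.profiles.items() if expires_at <= now]
        for subject_id in expired:
            del self.profiles[subject_id]
        return expired

    def add_embedding(self, subject_id: str, vector: list) -> None:
        self.embeddings.append((subject_id, vector))

    def delete_subject(self, subject_id: str) -> int:
        """Deletion by subject across tiers; returns how many embeddings were removed."""
        before = len(self.embeddings)
        self.embeddings = [e for e in self.embeddings if e[0] != subject_id]
        self.profiles.pop(subject_id, None)
        return before - len(self.embeddings)
```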

Building this kind of disciplined logging and memory hygiene is also the foundation of incident response and access requests, which we cover in the AI agent audit trail and logging guide.

Erasure: keep personal data out of the model

Article 17, the right to erasure, contains no AI-specific exemption: it applies to personal data wherever it is held. Deleting a record from a transactional database is routine. Deleting personal data once it has been fine-tuned into model weights is, in practice, near-infeasible without costly retraining, and machine unlearning is still an experimental research area rather than a production control.

The defensible architecture is to keep personal data in deletable, addressable tiers, including databases, caches, and vector stores that expose deletion APIs, and to keep it *out of model parameters*. When you do that, erasure becomes a tractable API contract you can enforce through processor agreements and design into a DPIA, rather than a manual, best-effort cleanup that may simply be impossible. The corollary is to be extremely cautious about fine-tuning a model on raw personal data, because you may be creating an obligation you cannot satisfy. The same data-baked-in problem underlies why traditional controls struggle with agents, which we explore in why traditional DLP fails for AI agents.
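
What "a tractable API contract" could mean in practice is sketched below: every store that may hold personal data implements the same erase-by-subject method, and an erasure request fans out to all of them and returns a per-store receipt. The `ErasableStore` protocol and `DictStore` class are hypothetical; real stores would wrap a database, cache, or vector index behind the same interface.

```python
from typing import Protocol


class ErasableStore(Protocol):
    """The contract each personal-data tier is assumed to expose."""
    name: str

    def erase_subject(self, subject_id: str) -> int: ...


class DictStore:
    """Toy store holding (subject_id, payload) rows, standing in for a real backend."""

    def __init__(self, name: str, rows: list):
        self.name = name
        self.rows = rows

    def erase_subject(self, subject_id: str) -> int:
        kept = [row for row in self.rows if row[0] != subject_id]
        removed = len(self.rows) - len(kept)
        self.rows = kept
        return removed


def erase_everywhere(stores: list, subject_id: str) -> dict:
    """Fan an erasure request out to every store; return removed counts per store."""
    return {store.name: store.erase_subject(subject_id) for store in stores}


stores = [
    DictStore("crm_cache", [("subj-1", "note"), ("subj-2", "note")]),
    DictStore("vector_index", [("subj-1", [0.1]), ("subj-1", [0.2])]),
]
receipt = erase_everywhere(stores, "subj-1")
print(receipt)
```

A model's weights cannot implement `erase_subject`, which is the architectural argument of this section expressed as a type.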

MCP servers, processors, and cross-border transfers

Agents rarely act alone. They reach external systems through MCP servers, and that composition has direct GDPR consequences that are easy to overlook.

The MCP server may be a sub-processor

When an agent reaches, say, a CRM through an MCP server, that server may store, cache, or relay enterprise personal data on your behalf, making it a processor or sub-processor under Article 28 and requiring a data processing agreement. Crucially, controller and processor roles in an agentic system must be resolved at runtime, not assumed at design time, because agents compose tools dynamically and a given run may pull in a tool no one anticipated. For the underlying mechanics of how agents authenticate to these servers, see the MCP server security guide and OAuth for MCP servers explained.

Cross-border tool calls are transfers

If a tool the agent calls, a summarization or translation API for example, runs outside the EEA, that call is an international transfer under Chapter V and needs a transfer mechanism such as Standard Contractual Clauses and, where relevant, a transfer risk assessment. An agent that silently routes a paragraph of personal data to a model endpoint in another jurisdiction has performed a transfer no one assessed.
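
Both questions, whether a processor agreement exists and whether a transfer mechanism is in place, can be asked at the moment of the tool call rather than at design time. The sketch below assumes a hypothetical tool registry keyed by tool name; `ToolRecord` and `precall_issues` are illustrative names, and the registry contents would come from your own inventory and legal review.

```python
from dataclasses import dataclass
from typing import Optional

# EEA member states as ISO 3166-1 alpha-2 codes (27 EU states plus IS, LI, NO).
EEA = frozenset({
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR", "HU",
    "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES",
    "SE", "IS", "LI", "NO",
})


@dataclass(frozen=True)
class ToolRecord:
    name: str
    country: str                               # where the endpoint processes data
    handles_personal_data: bool
    dpa_signed: bool                           # Article 28 agreement on file
    transfer_mechanism: Optional[str] = None   # e.g. "SCC" or "adequacy"


def precall_issues(tool: Optional[ToolRecord]) -> list:
    """Return the compliance gaps that should block this tool call."""
    if tool is None:
        return ["tool not in registry: roles unresolved"]
    if not tool.handles_personal_data:
        return []
    issues = []
    if not tool.dpa_signed:
        issues.append("no Article 28 agreement on file")
    if tool.country not in EEA and tool.transfer_mechanism is None:
        issues.append("non-EEA endpoint without a Chapter V transfer mechanism")
    return issues


registry = {
    "translate": ToolRecord("translate", "US", True, dpa_signed=True),
    "crm": ToolRecord("crm", "DE", True, dpa_signed=True),
}
print(precall_issues(registry.get("translate")))
print(precall_issues(registry.get("unknown_tool")))
```

Treating an unregistered tool as a blocking issue is the important default: a tool that no one anticipated is exactly the case where the roles have not been resolved.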

Authentication and accountability

The MCP authorization specification requires OAuth 2.1 with PKCE using the S256 method, supports dynamic client registration, and relies on RFC 9728 protected resource metadata for discovery; OAuth 2.1 bans the implicit grant, and the 2025 spec revisions explicitly ruled out the plain PKCE method. Prefer short-lived, narrowly scoped tokens so a compromised agent credential has minimal blast radius. One open gap matters for GDPR accountability: how the non-human agent identity authenticates to the authorization server is still under-specified, and Article 5(2) accountability ultimately asks *who or what* processed the data. Establishing strong agent identity is its own discipline, covered in non-human identity governance and least privilege for AI agents.
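
For readers who have not implemented PKCE, the S256 method is small enough to show in full. The transformation itself is defined in RFC 7636: the challenge is the base64url encoding, without padding, of the SHA-256 hash of the verifier. The function names below are our own, and the sketch covers only the client-side values, not the surrounding authorization flow.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Generate a high-entropy verifier (43 URL-safe characters from 32 random bytes)."""
    return secrets.token_urlsafe(32)


def s256_challenge(verifier: str) -> str:
    """Derive the S256 code challenge: BASE64URL(SHA256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


verifier = make_code_verifier()
challenge = s256_challenge(verifier)
# The client sends `challenge` with code_challenge_method=S256 in the authorization
# request, then proves possession by sending `verifier` in the token request.
```

With the plain method the challenge equals the verifier, so anyone who observes the authorization request can redeem the code; S256 closes that gap.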

The lethal trifecta is a standing breach risk

Simon Willison's *lethal trifecta* (June 2025) names the combination that turns an agent into an exfiltration channel: access to private data, exposure to untrusted content, and the ability to communicate externally. Any agent with all three can be prompt-injected into leaking personal data, which under GDPR is a personal data breach. Article 33 requires notifying the supervisory authority without undue delay and, where feasible, within 72 hours of becoming aware of the breach, unless it is unlikely to result in a risk to individuals. Article 34 additionally requires notifying the affected individuals where the breach is likely to result in a high risk to them.

MCP amplifies this because it mixes tools from many sources into a single agent, widening the untrusted-content surface. The crucial point for compliance teams is that the only reliable defense is structural, removing one leg of the trifecta, not detection. An agent that cannot reach untrusted content, or cannot send data outbound, cannot be turned into a leak. For the mechanics, see indirect prompt injection explained and the lethal trifecta in production agents.
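
Because the trifecta is a property of the combined tool set rather than of any single tool, it can be checked at configuration time. The sketch below assumes each tool has been labeled with the capabilities it grants; the labels and tool names are hypothetical, and the check is only as good as that labeling.

```python
TRIFECTA = frozenset({"private_data", "untrusted_content", "external_comms"})


def combined_capabilities(tools: dict) -> set:
    """Union the capability labels of every tool attached to one agent."""
    capabilities = set()
    for tool_capabilities in tools.values():
        capabilities |= set(tool_capabilities)
    return capabilities


def trifecta_legs(tools: dict) -> set:
    """Return which legs of the trifecta this tool set gives the agent."""
    return combined_capabilities(tools) & TRIFECTA


def is_lethal(tools: dict) -> bool:
    """True when the agent holds all three legs at once."""
    return trifecta_legs(tools) == TRIFECTA


agent_tools = {
    "crm_reader": {"private_data"},
    "web_fetch": {"untrusted_content"},
    "send_email": {"external_comms"},
}
print(is_lethal(agent_tools))

# Structural fix: remove one leg, here outbound communication.
without_email = {name: caps for name, caps in agent_tools.items() if name != "send_email"}
print(is_lethal(without_email))
```

No single tool in the example is dangerous on its own, which is why per-tool review misses the pattern.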

Discovery first: Article 30 is the legal hook

Every obligation above shares one precondition. Article 30 requires Records of Processing Activities: a record of what personal data is processed, for what purpose, with which recipients, where it is transferred, and how long it is retained. For agents, that means a record per agent, and it is not paperwork. It is the thing that makes every other GDPR claim provable.

Consider what Article 30 demands once agents are in scope:

You cannot produce a RoPA for an agent you do not know exists.
You cannot answer a one-month DSAR if you cannot trace what an agent did with a person's data.
You cannot do breach-notification data lineage for tool calls you never logged.
You cannot prove a lawful basis for processing you never inventoried.

This is the literal embodiment of *you cannot govern what you cannot see*. Shadow agents and unregistered MCP servers are not just a security problem; they are an unmet legal obligation the moment one of them touches personal data. We make the broader case in AI agents are the new shadow IT.
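
The gap between what is running and what is recorded is a set difference. The sketch below assumes a hypothetical `RopaEntry` shape mirroring the Article 30 items listed above, and a `discovered` set produced by whatever discovery tooling you use; neither is prescribed by the regulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RopaEntry:
    agent_id: str
    purposes: tuple         # why the data is processed
    data_categories: tuple  # what personal data is processed
    recipients: tuple       # who receives it, including MCP servers
    transfers: tuple        # third countries, if any
    retention: str          # how long it is kept


def unrecorded_agents(discovered: set, ropa: list) -> set:
    """Agents seen in the environment that have no record of processing."""
    recorded = {entry.agent_id for entry in ropa}
    return discovered - recorded


ropa = [
    RopaEntry(
        agent_id="support-triage",  # hypothetical agent
        purposes=("ticket routing",),
        data_categories=("name", "email", "ticket text"),
        recipients=("crm-mcp-server",),
        transfers=(),
        retention="90 days",
    ),
]
discovered = {"support-triage", "sales-enrichment"}
print(unrecorded_agents(discovered, ropa))
```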

Concrete controls a security and privacy team can take

Translating the above into an actionable program:

Discover and inventory every agent and MCP server, then map each one's personal-data flows into your RoPA. This is step zero; see how to build an AI agent inventory.
Pin a lawful basis per agent and per processing purpose. Document the three-step test wherever you rely on legitimate interest; flag any agent that infers Article 9 categories.
Implement purpose locks and goal-change gates so scope expansion forces a lawful-basis re-check before the agent proceeds.
Identify Article 22 decisions and ensure meaningful human review where decisions have legal or significant effects.
Adopt tiered memory governance with explicit TTLs and deletion APIs; keep personal data out of model weights so Article 17 stays satisfiable.
Depersonalize behavior logs to reconcile traceability with storage limitation and survive erasure requests.
Resolve controller/processor roles at runtime, sign DPAs with MCP providers acting as sub-processors, and put transfer mechanisms in place for cross-border tool calls.
Break the lethal trifecta structurally for any agent that holds personal data, and wire prompt-injection-driven exfiltration into your breach-detection and Article 33/34 process.
Monitor for scope drift and anomalies at runtime; see runtime monitoring and anomaly detection for AI agents.

Where continuous agent and MCP visibility fits

Most of these controls assume a fact that is rarely true by default: that you actually know which agents and MCP servers exist, what data they touch, and how their permissions and behavior change over time. That visibility is the category Anomity works in: discovering and inventorying every agent and MCP server, monitoring permissions and behavior for purpose drift and Article 22 decisions, alerting on the trifecta pattern that precedes a breach, and producing the depersonalized execution trace that feeds a RoPA and answers a DSAR.

The point is not the tooling. It is that GDPR for AI agents is an evidence problem before it is a policy problem. You can write the best lawful-basis documentation and the cleanest erasure architecture in the world, but if a shadow agent is quietly processing personal data outside your inventory, none of it is provable, and under Article 30, none of it is compliant. Visibility is the precondition for every other control here.

The bottom line

GDPR did not get easier when agents arrived; it got harder to evidence, because the entity doing the processing now improvises at runtime. Consent gives way to purpose locks. Indefinite logs give way to depersonalized traces and tiered memory. Erasure becomes an architectural decision to keep personal data addressable. MCP turns tool composition into a sub-processor and transfer question. And underneath all of it, Article 30 quietly insists that you first know what you have. Start with discovery, pin lawful basis per agent, keep personal data deletable, and break the trifecta, in that order.

Frequently asked questions

Does GDPR apply to AI agents?

Yes. GDPR applies whenever an AI agent processes personal data of people in the EU, regardless of whether the processing is autonomous. The agent is a means of processing; the organization deploying it is the controller and remains fully accountable under Article 5(2). Autonomy does not reduce obligations; it makes them harder to satisfy because scope, retention, and downstream flows are decided at runtime rather than design time.

What is the lawful basis for an AI agent processing personal data?

You need one of the Article 6 bases, most commonly consent (Art 6(1)(a)) or legitimate interest (Art 6(1)(f)). Consent is operationally fragile for agents because their scope drifts mid-task. Legitimate interest requires the three-step test the EDPB reaffirmed in Opinion 28/2024: identify the interest, show necessity, and balance it against data-subject rights. If the agent infers special-category data such as health or ethnicity, Article 9 adds a separate, higher bar that a generic consent flow rarely meets.

Can you fulfill a right-to-erasure request against an AI agent?

Only if the personal data lives in deletable, addressable stores such as databases, caches, and vector stores with deletion APIs. Article 17 has no AI-specific carve-out, but data baked into model weights through fine-tuning is effectively impossible to delete without costly retraining, and machine unlearning remains experimental. The practical stance is architectural: keep personal data out of model parameters so erasure stays a tractable API operation.

Is an MCP server a data processor under GDPR?

It can be. When an agent reaches an external system through an MCP server and that server stores, caches, or transmits enterprise personal data on your behalf, it acts as a processor or sub-processor under Article 28 and needs a data processing agreement. If the MCP server or the tool it calls sits outside the EEA, the call is also an international transfer under Chapter V and needs a transfer mechanism such as SCCs.

How does the EU AI Act logging requirement conflict with GDPR erasure?

The two pull in opposite directions: the AI Act pushes you to log agent behavior for traceability, while GDPR storage limitation (Art 5(1)(e)) and erasure (Art 17) push you to delete. The reconciliation pattern is to log the agent's behavior, including inputs, outputs, tool calls, and confidence, without logging who triggered each interaction. Depersonalized behavior logs stay useful for compliance and survive an erasure request because they no longer identify a person.

What is Article 22 and why does it matter for agents?

Article 22 restricts solely automated decisions that produce legal or similarly significant effects on a person. An agent that approves, denies, prices, or escalates without meaningful human review can fall squarely inside it. It is arguably the most important article for autonomous agents because the whole point of agency is to act without a human in the loop, which is precisely what Article 22 constrains for high-impact decisions.

Why is discovery the first step in GDPR compliance for agents?

Because Article 30 requires Records of Processing Activities: a record of what personal data is processed, why, where it goes, and how long it is kept, which for agents means an inventory per agent. You cannot produce that record, answer a data subject access request within the one-month deadline, or trace data lineage for a breach for agents and MCP servers you have not discovered. Every GDPR claim about an agent is unprovable until the agent and its data flows are inventoried.

What is the lethal trifecta and how does it relate to GDPR?

Coined by Simon Willison in June 2025, the lethal trifecta is the combination of access to private data, exposure to untrusted content, and the ability to communicate externally. Any agent with all three can be prompt-injected into exfiltrating personal data, which is a personal data breach under GDPR and can trigger Article 33/34 notification. The only reliable defense is structural: remove one leg of the trifecta rather than relying on detecting the attack.