
AI Supply-Chain Attacks: A Defender's Guide for Security Teams (2026)

Anomity Research · Jun 12, 2026 · 12 min read

TL;DR

AI supply-chain attacks target two layers at once: the registries your dependencies come from, and the .cursorrules and CLAUDE.md files your AI coding agents read as trusted context.
Scale is the story. Sonatype counted more than 454,600 new malicious packages in 2025, with over 99% on npm - fast detection narrows the install window but never closes it.
The decisive shift is when code runs: Shai-Hulud 2.0 moved theft to preinstall (so the payload still executes when the install fails), and the Miasma 'Phantom Gyp' worm hides execution in a 157-byte binding.gyp that script-field scanners miss.
There is usually no CVE to patch - TrapDoor, Shai-Hulud 2.0, and PolinRider are tracked by campaign name, so remediation is operational: inventory, rotate secrets, and govern what agents do.
AI agents inherit the developer's reach. A poisoned CLAUDE.md with zero-width Unicode instructions can steer an agent into reaching SSH keys, cloud credentials, and GitHub tokens that EDR and DLP never flag.
The durable control is inventory plus a runtime hook: classify the eight AI artifact types per endpoint, then allow/deny/log each tool call before it runs, recorded to a 90-day audit trail.

You decided to let AI coding agents into your developer fleet, and that decision quietly changed your supply-chain risk. AI supply-chain attacks no longer stop at a backdoored package - they now target the .cursorrules and CLAUDE.md files your agents read as trusted context, so the agent itself becomes the delivery mechanism. The scale makes this concrete: Sonatype counted more than 454,600 new malicious open-source packages in 2025, with over 99% on npm, and the first self-replicating npm worms proved that this malware can now propagate on its own. The defender's question for 2026 is not which package to patch - most of these campaigns have no CVE - but which endpoints run agents and CLIs that install code or read project config, and what those agents are allowed to do. This guide lays out the decision framework and links the campaign advisories behind it; if you want the runtime control first, jump to runtime governance.

The pattern across every recent campaign is the same: trust in code or configuration the agent never wrote, abused to run attacker logic inside an environment full of secrets. What changed in 2025 and 2026 is *where* and *when* that logic runs - preinstall instead of postinstall, a native-build hook instead of a script field, an agent's context file instead of a binary. We will walk through what these attacks are, why they land on the agent layer, how the major campaigns work, and how to select and operate a control that holds the boundary. The eight AI artifact types Anomity inventories - agents, MCP servers, extensions, skills, plugins, secrets, hooks, and CLIs - are the surface every one of these campaigns crosses, and you can see that surface in fleet visibility.

What counts as an AI supply-chain attack in 2026?

An AI supply-chain attack is a compromise that rides the dependencies, registries, and configuration your AI agents and developers trust, and that increasingly targets the AI artifact layer itself. It splits into three families that often combine in one campaign:

Registry-borne package malware - a malicious or trojanized package on npm, PyPI, or Crates.io that runs code at install or build time. This is the classic vector, and the six campaigns linked here all use it as the entry point.
Agent-context poisoning - planting instructions in the files an AI coding assistant reads as authoritative, such as .cursorrules and CLAUDE.md, so the agent itself takes the hostile action. TrapDoor is the clearest example, hiding instructions in zero-width Unicode.
Identity and history abuse - using stolen npm or GitHub credentials to backdoor further packages (the worm pattern), and rewriting Git history to falsify commits so the log a defender trusts no longer reflects what is on disk (PolinRider).

The defining property is that there is usually nothing to patch. TrapDoor, Shai-Hulud 2.0, and PolinRider are tracked by campaign name and by the specific package-and-version pairs they backdoored, not by a fixed version you roll forward to. That is why the response shifts from patching to inventory and governance at the agent boundary, and why the AI security framework treats the agent layer as its own control domain.

Why do these attacks land on the agentic-endpoint layer?

Because the path to your secrets now runs through an AI agent more often than through a human at a terminal. A coding agent that scaffolds a project, adds a dependency, or runs a build invokes npm install or pip install, and the moment a transitively backdoored package is in the tree, the payload fires inside the agent's process with its environment and reachable secrets. The agent inherits the developer's reach: SSH keys, cloud credentials, GitHub tokens, and - in the crypto and DeFi tooling TrapDoor targeted - wallet keystores, all coexisting on the same machine.

This is hard to see from the controls you already run. The install looks like a legitimate build step to network tooling, the agent process looks legitimate to EDR, and DLP sees nothing at rest because the stealer reads tokens from the live environment and pushes them through GitHub's normal API surface, Gists, or cloud API calls. A poisoned CLAUDE.md looks like an ordinary project file - none of those controls classify agent-config artifacts or decode zero-width Unicode. The question is not whether one package is patched; it is which endpoints run AI agents and CLIs that install code or read project config, and what those agents do with the secrets they can reach. That boundary is what runtime governance holds, and the gaps between it and your other tooling are mapped in the comparison.

How do the major 2025-2026 campaigns actually work?

Each campaign refines the same idea - run attacker code through a trusted install or config path - with a different evasion. The table summarizes the six advisories this guide hubs; follow each link for the full breakdown and the per-fleet checklist.

Campaign	Ecosystem	Scale (real numbers)	Defining technique
Shai-Hulud 2.0	npm	796 packages, ~20M weekly downloads, ~1,200 orgs	Moved theft to preinstall; self-replicates with no C2; home-directory wiper fallback
Mini Shai-Hulud	npm + PyPI	42 @tanstack packages / 84 versions; 170+ pkgs, 518M+ downloads	CVE-2026-45321 (CVSS 9.6); `optionalDependencies` to orphan commit, `prepare` hook exits non-zero to look broken
Miasma 'Phantom Gyp'	npm	57 packages / 286+ versions in under two hours	Abuses a 157-byte `binding.gyp` native-build hook, bypassing `package.json` script scanners
Hades	PyPI	37 wheels across 19 packages	`*-setup.pth` startup hook runs a Bun stealer on every Python invocation, no import needed
PolinRider	npm + VS Code	1,951 repos across 1,047 owners (2.9x in 5 weeks)	DPRK/Lazarus; appends obfuscated JS after valid config; rewrites Git history
TrapDoor	npm + PyPI + Crates.io	34+ packages / 384+ versions from May 19 2026	Plants poisoned `.cursorrules` and `CLAUDE.md` with zero-width Unicode to steer AI agents

Read across the rows and three trends stand out. First, the execution trigger keeps moving to earlier and less-watched lifecycle points - postinstall, then preinstall, then the binding.gyp native-build hook, then a .pth startup hook that needs no import at all. Second, the evasion increasingly defeats review, not just scanners: PolinRider appends code after valid config and rewrites history, while Mini Shai-Hulud's prepare hook exits non-zero to mimic a broken optional dependency. Third, TrapDoor crosses into a new surface entirely - the agent's own context files. The MCP equivalent of this artifact-layer risk is covered in the MCP server security guide.
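To make the first trend concrete, here is a minimal sketch that lists the install-time execution triggers present in a node_modules tree: the lifecycle script fields in package.json plus the presence of a binding.gyp file, which npm treats as a cue to run a native build when no install or preinstall script is defined. The function names and the report shape are illustrative assumptions, and a hit means "code can run at install time here", not "this package is malicious".

```python
import json
from pathlib import Path

# Lifecycle script names that npm can run around an install.
INSTALL_SCRIPTS = ("preinstall", "install", "postinstall", "prepare")


def install_time_triggers(package_dir):
    """Return the install-time execution triggers found in one package directory."""
    package_dir = Path(package_dir)
    triggers = []
    scripts = {}
    manifest = package_dir / "package.json"
    if manifest.is_file():
        try:
            data = json.loads(manifest.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            data = None
            triggers.append("unreadable package.json")
        if isinstance(data, dict) and isinstance(data.get("scripts"), dict):
            scripts = data["scripts"]
    for name in INSTALL_SCRIPTS:
        if name in scripts:
            triggers.append(f"script:{name}")
    # A binding.gyp is a second execution path that script-field checks miss.
    if (package_dir / "binding.gyp").is_file():
        triggers.append("binding.gyp")
    return triggers


def scan_node_modules(root):
    """Map each package directory under root to its install-time triggers."""
    report = {}
    for manifest in Path(root).rglob("package.json"):
        found = install_time_triggers(manifest.parent)
        if found:
            report[str(manifest.parent)] = found
    return report
```

Many legitimate native add-ons ship a binding.gyp, so the output is best read as an inventory to diff between installs rather than a verdict.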

What makes the agent-config poisoning vector different?

Registry malware abuses code; agent-config poisoning abuses *instructions*. When TrapDoor plants a CLAUDE.md or .cursorrules file with zero-width-encoded text, there is no executable payload in the file - there is a sentence the agent reads as project context and acts on. The agent, not a dropped binary, performs the exfiltration, which means signature-based detection of the file contents finds nothing and the action blends into the agent's legitimate behavior.

The instruction is invisible to humans (zero-width Unicode renders as nothing in most editors) but fully legible to the model.
The action runs with the agent's privileges and reach, so it touches the same secrets the agent already could.
There is no version to fix - you remediate by scanning for unexpected config files and zero-width characters, and by gating what the agent is allowed to do.
The same file can persist across sessions and propagate when the repo is shared, so one poisoned CLAUDE.md can re-infect every clone.

This is why classifying agent-config artifacts as part of the eight AI artifact types matters. A control that only watches package installs misses the entire TrapDoor pattern; a control that inventories and classifies the files agents read can flag an unexpected CLAUDE.md the way it flags an unvetted MCP server. The policy mapping is in the agentic AI governance guide.
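The scan for hidden instructions can be sketched in a few lines. The example below walks a directory for .cursorrules and CLAUDE.md files and reports the position of every invisible code point it finds. The set of code points is an illustrative assumption, not a list of what TrapDoor actually uses, and the function names are hypothetical.

```python
from pathlib import Path

# Code points that render as nothing in most editors.
# Illustrative assumption: not a catalogue of any campaign's encoding.
INVISIBLE = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
    "\u2060": "WORD JOINER",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}
AGENT_CONFIG_NAMES = {".cursorrules", "CLAUDE.md"}


def find_invisible(text):
    """Return (line, column, name) for each invisible code point in text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in INVISIBLE:
                hits.append((line_no, col, INVISIBLE[ch]))
    return hits


def scan_repo(root):
    """Map each agent-config file under root to its invisible-character hits."""
    findings = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.name in AGENT_CONFIG_NAMES:
            # utf-8-sig drops a leading byte-order mark, which is benign.
            text = path.read_text(encoding="utf-8-sig", errors="replace")
            hits = find_invisible(text)
            if hits:
                findings[str(path)] = hits
    return findings
```

Zero-width joiners also appear legitimately in emoji sequences and some scripts, so a hit is a prompt to review the file, not proof of poisoning.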

Which secrets are actually at risk when an endpoint is hit?

More than developers usually assume, because the stealers enumerate broadly and validate before exfiltrating. The Hades wave alone targets GitHub and GitHub Actions runner secrets; npm, PyPI, RubyGems, JFrog, CircleCI, and Anthropic tokens; AWS, GCP, and Azure credentials; Kubernetes and Vault secrets; and developer artifacts including .env, .npmrc, .pypirc, SSH keys, Docker configs, and Claude/MCP configs - plus cross-platform memory scrapers that catch credentials held only in process memory. TrapDoor's npm payload validates stolen AWS and GitHub credentials with live API calls before exfiltration, so it filters for keys that actually work.

The practical implication: when an endpoint is touched, treat every secret reachable from it as exposed and rotate it, rather than reasoning about whether a specific file was read. This is also why on-endpoint redaction matters - Anomity collects metadata only and redacts secrets such as SSH keys, cloud credentials, and GitHub tokens on the endpoint before anything leaves it, so the governance layer does not become another central store of plaintext secrets for a payload to find. The audit trail records secret access without storing the secret, as the outcomes section describes.
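As an illustration of redacting before anything leaves the endpoint, the sketch below replaces secret-shaped strings with a typed placeholder and returns which kinds it saw, so a record can say "a GitHub token was present" without carrying the token. This is not Anomity's implementation; the patterns are simplified assumptions about common token shapes and will miss formats they do not describe.

```python
import re

# Simplified, assumed patterns; real secret formats vary and change over time.
PATTERNS = [
    ("github_token", re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b")),
    ("aws_access_key_id", re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")),
    (
        "private_key",
        re.compile(
            r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
            re.DOTALL,
        ),
    ),
]


def redact(text):
    """Return (redacted_text, kinds) where kinds lists the secret types replaced."""
    kinds = []
    for kind, pattern in PATTERNS:
        text, count = pattern.subn(f"[REDACTED:{kind}]", text)
        if count:
            kinds.append(kind)
    return text, kinds
```

Returning the kinds alongside the redacted text is the design choice that lets an audit record note secret access without storing the secret itself.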

How should I select a control for AI supply-chain attacks?

Use selection criteria that match the actual failure mode: no CVE, fast campaigns, agent-layer surface. The decision matrix below contrasts the common approaches against the requirements these campaigns impose.

Requirement (set by the campaigns)	Dependency scanner / SCA	EDR / DLP	Agentic-endpoint governance (Anomity)
Inventory which endpoints run AI agents, CLIs, and read agent-config files	No	Partial (process only)	Yes - eight artifact types per endpoint
Classify `.cursorrules` / `CLAUDE.md` as an instruction surface	No	No	Yes
Stop an action when there is no CVE to patch	No (needs a known-bad version)	Heuristic only	Yes - allow/deny/log at the hook
Decide on each agent tool call before it runs	No	No	Yes - e.g. Claude Code PreToolUse
Answer scope from a record after a campaign	Partial (manifest diff)	Partial	Yes - queryable 90-day audit trail
Keep secrets out of a central store	N/A	Varies	Yes - metadata only, on-endpoint redaction

The point is not that scanners or EDR are wrong - Anomity complements, not replaces, your Network, EDR, DLP, and GRC stack. It is that none of them were built to inventory the agent layer or to decide on an agent's individual tool call. A scanner tells you a known-bad version exists after it is catalogued; it cannot stop the pre-detection install that TrapDoor and Miasma both demonstrated. The criteria that close the gap are inventory of the AI artifact layer, a decision point on each tool call, and a durable record. See where these land against your existing tools in the comparison and the docs.

What are the edge cases that break naive defenses?

The failed install. Shai-Hulud 2.0's preinstall script runs before installation completes, so it executes even when the install later fails, and 'the install errored out, we're fine' is false. Gate the install command, not its success.
The non-script execution path. Miasma's binding.gyp runs code through node-gyp's native build, invisible to scanners watching only preinstall and postinstall fields in package.json.
No import required. Hades' *-setup.pth file runs at Python interpreter startup, so the payload fires on every python, pytest, build, or CI job - the dependency never has to be imported.
The trusted commit log. PolinRider rewrites Git history to falsify commits, so reviewing the log is not enough; the code on disk can differ from what the history shows.
The invisible instruction. TrapDoor's zero-width Unicode in CLAUDE.md is legible to the agent and invisible to the reviewer, so reading the file by eye misses it.
The 'broken' dependency. Mini Shai-Hulud's prepare hook deliberately exits non-zero so the malicious step looks like a broken optional dependency in install logs.

Every one of these defeats a check that assumes the attack is visible at the point you look. The common thread is that the only reliable observation point is the action itself - the agent's tool call as it is about to run - which is exactly where governance at the hook sits, the boundary detailed in how it works.
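The "no import required" case is the easiest of these to check for directly. Python's site module executes any line in a .pth file that starts with `import` followed by a space or tab, which is the mechanism a startup hook like the one described above relies on. A minimal sketch that lists such lines in a site-packages directory follows; the function names are hypothetical.

```python
from pathlib import Path


def executable_pth_lines(pth_path):
    """Return (line_number, text) for each line the site module would execute."""
    hits = []
    text = Path(pth_path).read_text(encoding="utf-8", errors="replace")
    for line_no, line in enumerate(text.splitlines(), start=1):
        # site.py executes lines beginning with "import" plus a space or tab;
        # every other non-comment line is treated as a path entry.
        if line.startswith(("import ", "import\t")):
            hits.append((line_no, line.strip()))
    return hits


def scan_site_dir(directory):
    """Map each .pth file in directory to its executable lines, if any."""
    report = {}
    for pth in sorted(Path(directory).glob("*.pth")):
        hits = executable_pth_lines(pth)
        if hits:
            report[pth.name] = hits
    return report
```

One way to call it is `scan_site_dir(sysconfig.get_paths()["purelib"])` after `import sysconfig`. Some legitimate tools install executable .pth lines too, so compare the result against a known-good baseline instead of treating every hit as hostile.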

How does Anomity govern AI supply-chain attacks?

With no version to roll forward, the durable control is to inventory the agents, CLIs, and config artifacts on each endpoint and govern what those agents do with the secrets they can reach. Anomity does this in three concrete steps.

1. Inventory and classify the artifact layer

Anomity inventories the eight AI artifact types - AI agents, MCP servers, extensions, skills, plugins, secrets, hooks, and CLIs - on every managed endpoint, then classifies them. It captures which coding-agent and CLI surfaces can run npm install, pip install, or cargo builds, and it surfaces poisoned agent-config artifacts - unexpected .cursorrules and CLAUDE.md files - that DLP and EDR do not classify. Concretely, after a campaign like Miasma you can list every endpoint with an npm-capable agent and confirm none pulled a backdoored version; after TrapDoor you can find every developer machine carrying a CLAUDE.md it should not have. This is the fleet visibility layer.

2. Decide at the hook: allow, deny, or log

On agents that expose a hook - for example the Claude Code PreToolUse event - Anomity evaluates each tool call against your policy and returns allow, deny, or log before the call runs. An agent that has read a poisoned CLAUDE.md and is about to reach for a cloud token, push to a Gist, or run an unexpected install can be denied at the boundary. A package whose preinstall would fire on a failed install never gets to run because the install command itself is gated. This is the runtime governance control that works when there is no patch and fast detection still leaves a pre-detection window.
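The shape of such a decision can be sketched as a small policy function. This is not Anomity's implementation: the rules are hypothetical, and the event field names (`tool_name`, `tool_input.command`, `tool_input.file_path`) and the convention that exit code 2 blocks the call are assumptions about the Claude Code hook interface that should be checked against its hooks documentation.

```python
import json
import re
import sys

# Hypothetical policy: which shell commands count as installs.
INSTALL_RE = re.compile(
    r"\b(npm|pnpm|yarn)\s+(install|i|add|ci)\b"
    r"|\bpip3?\s+install\b"
    r"|\bcargo\s+(install|build)\b"
)
# Deliberately crude substring match on credential locations.
CREDENTIAL_PATHS = ("/.ssh/", "/.aws/credentials", ".npmrc", ".pypirc")


def decide(event):
    """Return (decision, reason), where decision is allow, deny, or log."""
    tool = event.get("tool_name", "")
    tool_input = event.get("tool_input") or {}
    if tool == "Bash":
        command = tool_input.get("command", "")
        if INSTALL_RE.search(command):
            return "deny", "install command requires approval"
        if any(p in command for p in CREDENTIAL_PATHS):
            return "deny", "command references a credential path"
        return "log", "shell command recorded"
    if tool == "Read":
        path = tool_input.get("file_path", "")
        if any(p in path for p in CREDENTIAL_PATHS):
            return "deny", "read of a credential path"
    return "allow", "no rule matched"


def main():
    """Read one hook event from stdin, emit the decision, exit 2 to block."""
    event = json.load(sys.stdin)
    decision, reason = decide(event)
    record = {"decision": decision, "reason": reason, "tool": event.get("tool_name")}
    print(json.dumps(record), file=sys.stderr)
    sys.exit(2 if decision == "deny" else 0)
```

A hook script would end by calling `main()`. Because the install command itself is what gets evaluated, a preinstall payload never reaches the point where a failed install would have run it.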

3. Keep a queryable record

Anomity logs the tool calls and secret access an agent performs, recorded against a queryable 90-day audit trail, with decisions routed to SIEM, Slack, email, or Jira. When the next campaign lands you can answer which endpoints ran the affected installs, which carried poisoned config files, and what those agents were allowed to touch - from a record, not a guess. Anomity collects metadata only, redacts secrets on the endpoint, and is SOC 2 Type II; it complements your Network, EDR, DLP, and GRC tooling rather than replacing it. The record and its outputs are the outcomes layer.
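Answering scope from a record reduces to a filter over stored tool-call events. The sketch below assumes a hypothetical record schema (`timestamp` as an ISO 8601 string with an explicit UTC offset, plus `endpoint`, `tool`, and `command`); Anomity's actual record format is not described here, so treat every field name as illustrative.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)  # matches the 90-day audit window


def within_retention(records, now=None):
    """Keep only records whose timestamp falls inside the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - RETENTION
    return [r for r in records if datetime.fromisoformat(r["timestamp"]) >= cutoff]


def endpoints_that_ran(records, needle, now=None):
    """Return the sorted endpoints whose shell commands contained needle."""
    return sorted(
        {
            r["endpoint"]
            for r in within_retention(records, now)
            if r["tool"] == "Bash" and needle in r["command"]
        }
    )
```

Given a package-and-version pair from an advisory, `endpoints_that_ran(records, "some-pkg@1.2.3")` returns the machines to triage first.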

You can't govern what you can't see. - The Anomity principle

What should I do across my fleet this week?

Inventory every endpoint and pipeline that runs AI coding agents and CLIs capable of npm, pip, or cargo installs, and cross-check against the affected versions in the six campaign advisories linked above.
Scan every repository and developer machine for unexpected .cursorrules and CLAUDE.md files, and check them for zero-width Unicode characters that hide instructions from human reviewers.
Rotate every developer and cloud secret reachable from a touched endpoint - SSH keys, AWS/GCP/Azure credentials, GitHub and npm tokens, Vault and Kubernetes secrets - and assume preinstall or .pth execution means the secret was read.
Pin dependencies to known-good versions and disable lifecycle scripts where possible, so a preinstall, binding.gyp, or .pth payload cannot run on the next install.
Gate install, network, and credential-reading commands at a hook with allow/deny/log so action on injected instructions is stopped before it runs.
Confirm every tool call and secret access is written to a 90-day audit trail and routed to your SIEM, so you can answer scope when the next cross-ecosystem campaign lands.
Use the AI security framework and the agentic AI governance guide to turn this checklist into standing policy.
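For the lifecycle-script item in the checklist above, one small verifiable step is confirming that an .npmrc actually sets npm's `ignore-scripts` option. The sketch below parses the file text and reports whether the setting is enabled; the function name is hypothetical. It only covers npm configuration in that one file, so it says nothing about command-line flags or environment variables that override it, or about Python .pth files, which npm never sees.

```python
def ignore_scripts_enabled(npmrc_text):
    """Return True if the .npmrc text sets ignore-scripts to true."""
    enabled = False
    for raw in npmrc_text.splitlines():
        line = raw.strip()
        # .npmrc is ini-style: blank lines and ; or # comments are skipped.
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "ignore-scripts":
            # The last assignment in the file wins.
            enabled = value.strip().strip("\"'").lower() == "true"
    return enabled
```

Running this across developer machines and CI images gives a quick list of where install-time scripts are still permitted.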

The framework holds across every campaign in this cluster: inventory the AI artifact layer, decide on each agent tool call at the hook with allow/deny/log, and keep a queryable record - because these are AI supply-chain attacks with no CVE to wait on, fast enough that detection alone leaves a window, and now aimed at the config files your agents trust. Start by reading the six campaign advisories to scope your exposure, then operationalize the control. To see Anomity inventory the agent layer and govern tool calls across your fleet, request early access.

Frequently asked questions

What is an AI supply-chain attack?

It is a supply-chain compromise that targets the software your AI agents and developers depend on, and increasingly the AI artifact layer itself. The classic vector is a malicious package on npm, PyPI, or Crates.io that runs code at install. The newer vector is poisoned agent context: campaigns like TrapDoor plant .cursorrules and CLAUDE.md files with hidden instructions an AI coding assistant reads as trusted project context. Both abuse trust in code or configuration the agent never wrote, and both let attacker code execute inside the developer's environment with reach to the same SSH keys, cloud credentials, and tokens the agent can already touch.

Why are AI coding agents a bigger supply-chain risk than a human at a terminal?

Because the agent multiplies both the trigger and the trust. Modern coding agents scaffold projects, add dependencies, and run builds, so they invoke npm install and pip install far more often than a person would, widening the chance of pulling a backdoored package. They also read project config - CLAUDE.md, .cursorrules - as authoritative instructions, so a poisoned file steers them directly. The agent runs with the developer's environment and reachable secrets, so a transitively backdoored package fires inside the agent's process. That is the boundary runtime governance is built to hold, because the install looks legitimate to network tooling and EDR.

If there is no CVE, what do I actually do?

Treat remediation as operational, not patch-driven. TrapDoor, Shai-Hulud 2.0, and PolinRider are tracked by campaign name and by specific package-and-version pairs, so there is no fixed version to roll forward to. First, inventory which endpoints run AI agents and CLIs capable of installs, and which carry unexpected .cursorrules or CLAUDE.md files. Second, rotate every developer and cloud secret a touched endpoint could reach - assume preinstall execution means the secret was read. Third, pin dependencies to known-good versions and disable lifecycle scripts where possible. Then govern the tool calls those agents make at a hook with allow/deny/log so the next campaign is stopped before it runs.

What does 'preinstall' versus 'postinstall' change for defenders?

It removes the safety net of a failed install. The original Shai-Hulud ran during the postinstall lifecycle step; Shai-Hulud 2.0, identified November 24 2025, injects a preinstall script, so the payload runs before installation completes and even when installation fails. A failed npm install no longer protects you because the code has already executed. The Miasma 'Phantom Gyp' worm goes further, hiding execution in a 157-byte binding.gyp native-build hook that bypasses scanners watching only package.json script fields. The practical takeaway: do not assume a broken or aborted install was harmless, and gate install commands at the agent boundary rather than relying on the install succeeding.

How do attackers poison AI coding assistants specifically?

By writing instructions into the files agents read as context. The TrapDoor campaign, beginning May 19 2026, plants .cursorrules and CLAUDE.md files containing hidden instructions encoded with zero-width Unicode characters - invisible in most editors but read by the agent as project context. The agent can then be manipulated into hostile actions such as credential exfiltration, running inside the developer's environment with reach to SSH keys, cloud credentials, GitHub tokens, and wallet keystores. Because the file looks like an ordinary project file, DLP and EDR do not classify it or decode the zero-width Unicode. The defense is to inventory and classify agent-config artifacts as part of the eight AI artifact types, not treat them as inert text.

Don't my existing EDR, DLP, and network tools already cover this?

They cover the layers they were built for, and the AI artifact layer is not one of them. An npm install looks like a legitimate build step to network tooling; the agent process looks legitimate to EDR; DLP sees nothing at rest because credential stealers read tokens from the live environment and exfiltrate through normal GitHub API, Gist, and cloud API traffic. None of them classify a CLAUDE.md as an instruction surface or decode zero-width Unicode. Anomity complements - it does not replace - your Network, EDR, DLP, and GRC tooling, covering the agent, MCP, extension, skill, plugin, secret, hook, and CLI layer those controls never inventoried. See the comparison for where the seams are.

How fast are these campaigns, and does fast detection solve it?

Fast, and no. The Miasma worm compromised 57 npm packages across 286+ versions in a rolling campaign lasting under two hours. Socket detected new TrapDoor releases in a median of 5 minutes 27 seconds, with the fastest at 58 seconds - yet pre-detection installs still occurred. Detection narrows the window without closing it. That is why the durable control is not faster scanning alone but governance at the point of action: even if a backdoored package reaches an endpoint, the agent's attempt to run an unexpected install, push to a Gist, or reach a cloud token can be denied at the runtime hook before it executes, and recorded to the 90-day audit trail.

What is the relationship between this guide and the individual campaign advisories?

This pillar is the hub. It frames the decision - inventory, govern at the hook, keep the record - that applies across every campaign, and links to the detailed advisories for each one: the Hades PyPI .pth campaign, the Miasma 'Phantom Gyp' npm worm, the PolinRider DPRK campaign, Shai-Hulud 2.0, the Mini Shai-Hulud wave (CVE-2026-45321), and TrapDoor. For the MCP server angle, see the sibling MCP server security guide.