
Auditing Claude Code Across a Fleet: Settings, Hooks, MCP Servers, Plugins, and Skills

Anomity Research Security Researcher, Anomity · Jun 6, 2026 · 7 min read

When we inventory Claude Code across a fleet of managed laptops, the most interesting findings are not in any one engineer's setup. They are in the spread. Two developers on the same team, in the same repo, run the same agent with materially different permission rules, MCP servers, and hooks, because Claude Code reads configuration from several layers and merges them per directory. The result is a per-endpoint security posture that no inventory built for browsers or packages was designed to see.

This is a research-findings piece, not an advisory. We are documenting the recurring config patterns we observe and what each tells you about real risk. We covered why this layer needs inventorying at all in securing AI coding agents and CLIs; here we go deeper into the specific Claude Code artifacts and where they drift. Every behavior below is documented in Claude Code's own how-it-works and permissions references.

Where the configuration actually lives

Claude Code resolves settings through a precedence chain. From lowest to highest authority: user settings in ~/.claude/settings.json, shared project settings in .claude/settings.json, local project settings in .claude/settings.local.json, command-line arguments, and finally managed (enterprise) settings, which no lower level can override. Project-scoped MCP servers live in .mcp.json; user and local MCP scopes sit in ~/.claude.json alongside per-project trust state. Auditing one laptop means reconciling four or five files, and the effective policy is whatever wins the merge.

The practitioner consequence: reading .claude/settings.json out of a Git repo tells you what the team intended, not what runs. The local and user files, both untracked, are where drift accumulates, and a package scanner reading checked-in config misses the policy that actually governs tool calls on the endpoint.
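A minimal sketch of that reconciliation, covering only the three on-disk layers a per-user scan can read. The function names are ours, and the merge semantics are an assumption for illustration: we assume rules sit under a permissions key with allow, ask, and deny lists, that the lists are unioned across layers, and that a scalar such as defaultMode comes from the highest layer that sets it. Command-line arguments and managed settings are left out.

```python
import json
from pathlib import Path


def settings_layers(project_dir: Path, home: Path) -> list[tuple[str, Path]]:
    """On-disk layers from lowest to highest authority, as listed above."""
    return [
        ("user", home / ".claude" / "settings.json"),
        ("project", project_dir / ".claude" / "settings.json"),
        ("local", project_dir / ".claude" / "settings.local.json"),
    ]


def load_json(path: Path) -> dict:
    """Return the parsed file, or {} if it is missing or not valid JSON."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return {}


def merged_permissions(project_dir: Path, home: Path) -> dict:
    """Union allow/ask/deny across layers and record which layer added each rule.

    ASSUMPTION: list-valued rules accumulate across layers; defaultMode is
    taken from the highest layer that defines it.
    """
    merged = {"allow": {}, "ask": {}, "deny": {}, "defaultMode": None}
    for scope, path in settings_layers(project_dir, home):
        perms = load_json(path).get("permissions", {})
        for tier in ("allow", "ask", "deny"):
            for rule in perms.get(tier, []):
                merged[tier].setdefault(rule, []).append(scope)
        if "defaultMode" in perms:
            merged["defaultMode"] = (perms["defaultMode"], scope)
    return merged
```

Keeping the originating scope next to each rule is the point of the exercise: a rule whose only source is "local" or "user" is one the checked-in file would never have shown.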

settings.json allow/deny drift

The permission system is a tiered allow / ask / deny model, evaluated deny-first: deny rules are checked before ask, then allow, and rule specificity does not change that order. A deny at any scope blocks the call regardless of an allow elsewhere. So the audit question is never just "what is allowed" but "what is the merged deny set, and did a local file quietly widen the allow list past it?" The patterns we see most often:

Bare Bash on the allow list. A bare tool name allows every shell command. We see it added to .claude/settings.local.json to stop the prompts, turning the most powerful tool into a no-prompt path. Argument-constraining patterns, like a Bash(curl ...) rule scoped to one domain, are fragile: options before the URL, a different protocol, a redirect, or a shell variable all slip past.
Process-wrapper and runner gaps. Claude Code strips process wrappers like timeout and nice before matching, but it does not strip environment runners such as npx, docker exec, and devbox run, so Bash(devbox run *) matches whatever follows run, including a destructive command.
Read/Edit deny rules that look protective but are not absolute. Read(.env)-style deny rules cover Claude's file tools and recognized Bash commands like cat and sed, but not a Python or Node script that opens the file itself. They are defense-in-depth, not a boundary.
defaultMode set to bypassPermissions or acceptEdits. A user file flipping the default mode changes the safety story for every session on that machine.

None of this is exotic. It is the ordinary entropy of developers removing friction one prompt at a time. The point of the audit is to make the merged result legible: which endpoints widened past the team baseline, and on which tool.
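The patterns above are mechanical enough to flag automatically. The sketch below is one way to do it over an already-merged allow list; flag_drift, the finding strings, and the string heuristics are hypothetical and deliberately crude. It does not reimplement Claude Code's rule matcher, only surfaces rules that deserve a human look.

```python
import re
from typing import Optional

# Runner prefixes named above as not being stripped before matching.
RUNNERS = ("npx", "docker exec", "devbox run")
# Default modes called out above as changing the safety story.
RISKY_MODES = {"bypassPermissions", "acceptEdits"}


def flag_drift(allow_rules: list[str], default_mode: Optional[str] = None) -> list[str]:
    """Return human-readable findings for allow rules that widen past a baseline."""
    findings = []
    for rule in allow_rules:
        if rule == "Bash":
            findings.append("Bash: bare tool name allows every shell command")
            continue
        match = re.fullmatch(r"Bash\((.*)\)", rule)
        if match is None:
            continue  # not a Bash rule; out of scope for this sketch
        pattern = match.group(1).strip()
        if any(pattern.startswith(r + " ") for r in RUNNERS) and pattern.endswith("*"):
            findings.append(f"{rule}: wildcard after an environment runner")
        if pattern.startswith("curl "):
            findings.append(f"{rule}: argument-constrained curl rule is easy to sidestep")
    if default_mode in RISKY_MODES:
        findings.append(f"defaultMode={default_mode}: applies to every session on this machine")
    return findings
```

Run per endpoint against the team baseline, the output is the list the audit is after: which machine widened, and on which tool.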

MCP servers from registries

MCP servers are where the inventory gets genuinely external. A server can be added at local, project, or user scope: project servers ship in .mcp.json and travel with the repo, while user-scope servers follow the developer across every project. Many are pulled from public registries and marketplaces, and a stdio server is just a command Claude Code launches on the endpoint. That is a code-execution boundary, which we documented in Anthropic's MCP stdio RCE by design, with the broader hardening picture in the MCP server security complete guide. What we look for in a fleet MCP inventory:

Provenance: where each server came from (registry, marketplace, or hand-written), and whether the same logical server resolves to different commands across endpoints.
Scope sprawl: user-scope servers that load in every repo the developer opens, versus project-scope servers the team reviewed.
Trust state: the per-project trust flags in ~/.claude.json that decide whether a server runs without re-prompting.
Tool exposure: MCP permissions use the mcp__<server>__<tool> shape, and a deny like mcp__* removes every MCP tool from context. We record what is actually reachable.
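The first check, whether one logical server resolves to different commands across endpoints, can be sketched as below. We assume a .mcp.json holds a top-level mcpServers object whose stdio entries carry command and args and whose remote entries carry url. The function names, endpoint names, and package name in the usage example are hypothetical.

```python
import json
from collections import defaultdict


def servers_from_mcp_json(text: str) -> dict[str, str]:
    """Map server name -> launch target from the contents of a .mcp.json.

    ASSUMPTION: stdio servers use "command" plus optional "args"; anything
    else is treated as remote and identified by its "url".
    """
    servers = json.loads(text).get("mcpServers", {})
    targets = {}
    for name, cfg in servers.items():
        if "command" in cfg:
            targets[name] = " ".join([cfg["command"], *cfg.get("args", [])])
        else:
            targets[name] = cfg.get("url", "<unknown>")
    return targets


def divergent_servers(fleet: dict[str, dict[str, str]]) -> dict[str, dict[str, list[str]]]:
    """Return servers whose launch target differs between endpoints.

    `fleet` maps endpoint name -> {server name -> launch target}.
    The result maps server name -> {launch target -> endpoints using it}.
    """
    by_server = defaultdict(lambda: defaultdict(list))
    for endpoint, servers in fleet.items():
        for name, target in servers.items():
            by_server[name][target].append(endpoint)
    return {
        name: {target: sorted(endpoints) for target, endpoints in targets.items()}
        for name, targets in by_server.items()
        if len(targets) > 1
    }
```

For example, if laptop-a launches a server named github through npx and laptop-b launches the same name from a script in /tmp, divergent_servers returns github with both targets and the endpoints behind each, which is the row worth investigating.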

A multi-agent MCP setup also widens the blast radius for prompt injection, which we walked through in comment-and-control multi-agent prompt injection and credential theft. The inventory does not stop that attack, but you cannot reason about it until you know which servers are wired in where.

Plugins, skills, and hooks

Plugins bundle skills, agents, hooks, and MCP servers behind one install, and enabledPlugins toggles them per scope. Skills load from ~/.claude/skills/, the project .claude/skills/, and any --add-dir directory, with live reload, so a skill can appear mid-session without a restart. Hooks matter most to security teams: a PreToolUse hook runs before the permission prompt and can deny a tool call, force a prompt, or let it proceed, and a hook that exits with code 2 blocks the call before the allow rules are consulted.

Hooks cut both ways. A well-written PreToolUse hook is a legitimate enforcement point; the docs themselves recommend allowing Bash broadly and using a hook to reject specific commands. But a hook is an arbitrary shell command that fires on tool calls, and it can arrive through a plugin from a marketplace. The same mechanism that hardens one endpoint is, on another, an unreviewed script executing on every action. An audit has to enumerate hooks, where they came from (managed, user, project, or plugin), and what they run. This is the same project-file trust problem behind CVE-2025-59536, the Claude Code project-file RCE and token exfiltration.
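To make the enforcement side concrete, here is a minimal sketch of a PreToolUse hook that rejects specific Bash commands. It assumes the hook receives the pending tool call as JSON on stdin with tool_name and tool_input fields, and that exit code 2 with a message on stderr signals a block. The BLOCKED patterns are hypothetical examples, not a recommended policy.

```python
import json
import re
import sys

# Hypothetical deny patterns; a real deployment would maintain its own list.
BLOCKED = [
    # rm with both a recursive and a force flag in one option group (-rf, -fr, -Rf)
    re.compile(r"\brm\s+-(?=\w*[rR])(?=\w*f)\w+"),
    # piping a download straight into a shell
    re.compile(r"\b(?:curl|wget)\b[^|]*\|\s*(?:ba|z)?sh\b"),
]


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one tool-call event.

    ASSUMPTION: exit code 0 lets the call continue to normal permission
    handling and exit code 2 blocks it.
    """
    if event.get("tool_name") != "Bash":
        return 0, ""
    command = event.get("tool_input", {}).get("command", "")
    for pattern in BLOCKED:
        if pattern.search(command):
            return 2, f"blocked by policy hook: {command!r}"
    return 0, ""


def main(stream=sys.stdin) -> int:
    """Read one event from the stream, print any block reason to stderr."""
    code, message = decide(json.load(stream))
    if message:
        print(message, file=sys.stderr)
    return code
```

Installed as a hook script, the file would end by calling sys.exit(main()). Separating decide from the I/O keeps the policy testable, and it also shows the audit problem in miniature: the same twenty lines, swapped for something else by a plugin, would run on every tool call with no one the wiser.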

Artifact	Loads from	What the audit answers
settings.json rules	user / project / local / managed merge	Effective merged allow/ask/deny per endpoint
MCP servers	.mcp.json, ~/.claude.json (local/user)	Source, scope, trust state, reachable tools
Plugins	enabledPlugins per scope, marketplaces	What each bundle silently enables
Skills	~/.claude/skills, .claude/skills, --add-dir	Auto-invocable instructions, including live-reloaded ones
Hooks	managed / user / project / plugin	What shell runs before each tool call
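For the hooks row, the enumeration can be sketched as a flattening pass over each scope's parsed settings. We assume hooks are configured under a hooks key, grouped by event name, with each group holding a matcher and a list of command entries. hook_commands and the scope labels are ours, and plugin-supplied hooks would need their own loader.

```python
def hook_commands(settings_by_scope: dict[str, dict]) -> list[tuple[str, str, str, str]]:
    """Flatten hook configuration into (scope, event, matcher, command) rows.

    ASSUMPTION: settings look like
    {"hooks": {"PreToolUse": [{"matcher": "Bash",
                               "hooks": [{"type": "command", "command": "..."}]}]}}
    """
    rows = []
    for scope, settings in settings_by_scope.items():
        for event, groups in settings.get("hooks", {}).items():
            for group in groups:
                matcher = group.get("matcher", "*")
                for hook in group.get("hooks", []):
                    if hook.get("type") == "command":
                        rows.append((scope, event, matcher, hook.get("command", "")))
    return rows
```

Each row answers the table's question directly: which shell command runs, on which event and tool, and which layer put it there.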

CLAUDE.md and auto memory

CLAUDE.md is the per-project instruction file loaded into context every session, and auto memory persists learnings across sessions, with the first 200 lines or 25KB of MEMORY.md loading at session start. Neither changes what Claude Code allows: the docs are clear that permission rules are enforced by the harness, not the model, so a prompt or CLAUDE.md cannot widen access. But both shape what the agent tries to do, and both are free-text files an attacker or careless teammate can edit.

In an audit, CLAUDE.md and MEMORY.md are the behavioral context layer: they will not bypass a deny rule, but they can steer an agent toward a tool call that an over-broad allow rule then waves through. We treat them as inventory items worth diffing, not as controls. The CI variant of this trust gap, where context determines who the agent acts as, is what we covered in the Claude Code GitHub Action bot-actor bypass.
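Diffing these files needs nothing beyond the standard library. A minimal sketch, assuming the audit already holds a reviewed baseline copy of the file's text; fingerprint, context_drift, and the baseline/endpoint labels are hypothetical names.

```python
import difflib
import hashlib


def fingerprint(text: str) -> str:
    """Stable content hash, cheap to compare across a fleet."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def context_drift(baseline: str, observed: str, name: str = "CLAUDE.md") -> list[str]:
    """Return the added (+) and removed (-) lines relative to the baseline.

    An empty list means the endpoint's copy matches the reviewed one.
    """
    if fingerprint(baseline) == fingerprint(observed):
        return []
    diff = list(difflib.unified_diff(
        baseline.splitlines(),
        observed.splitlines(),
        fromfile=f"baseline/{name}",
        tofile=f"endpoint/{name}",
        lineterm="",
        n=0,  # no context lines; only the changes themselves
    ))
    # The first two entries are the ---/+++ file headers; hunk headers start with @@.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]
```

The hash answers "did it change" across thousands of endpoints. The line diff is for the handful where it did, since a single added instruction is what steers an agent toward an over-broad allow rule.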

Why no existing control parses this

Walk the stack. EDR sees a signed node or claude process making expected syscalls: trusted software doing trusted-software things. A network proxy or DLP tool sees TLS to an API endpoint and, for stdio MCP, no network traffic at all, because the server is a local subprocess. Package scanners read checked-in manifests and miss the untracked local and user settings where drift lives. GRC tooling captures policy intent, not merged per-endpoint reality. Each control is correct within its own frame and blind to the same agent running a different policy, different servers, and different hooks on each laptop. The artifacts are plain files in ~/.claude and .claude/, but reading them is not enough; you have to reconcile the precedence merge to know what actually governs a tool call.

This is the layer Anomity makes visible. On every managed endpoint we inventory and classify eight AI artifact types (AI agents, MCP servers, extensions, skills, plugins, secrets, hooks, and CLIs), so the fleet inventory shows the merged settings, registry-sourced MCP servers, enabled plugins, and hooks as they actually resolve per machine. On agents that expose a hook like Claude Code's PreToolUse, Anomity returns allow, deny, or log on each tool call before it runs, and keeps a queryable 90-day audit trail routed to your SIEM, Slack, email, or Jira. It complements EDR, DLP, and GRC rather than replacing them. To see what your own fleet's Claude Code configuration looks like once it is reconciled, request early access.