Auditing Claude Code Across a Fleet: Settings, Hooks, MCP Servers, Plugins, and Skills
When we inventory Claude Code across a fleet of managed laptops, the most interesting findings are not in any one engineer's setup. They are in the spread. Two developers on the same team, in the same repo, run the same agent with materially different permission rules, MCP servers, and hooks, because Claude Code reads configuration from several layers and merges them per directory. The result is a per-endpoint security posture that no inventory built for browsers or packages was designed to see.
This is a research-findings piece, not an advisory. We are documenting the recurring config patterns we observe and what each tells you about real risk. We covered why this layer needs inventorying at all in securing AI coding agents and CLIs; here we go deeper into the specific Claude Code artifacts and where they drift. Every behavior below is documented in Claude Code's own how-it-works and permissions references.
Where the configuration actually lives
Claude Code resolves settings through a precedence chain. From lowest to highest authority: user settings in ~/.claude/settings.json, shared project settings in .claude/settings.json, local project settings in .claude/settings.local.json, command-line arguments, and finally managed (enterprise) settings, which no lower level can override. Project-scoped MCP servers live in .mcp.json; user and local MCP scopes sit in ~/.claude.json with per-project trust state. Auditing one laptop means reconciling four or five files, and the effective policy is whatever wins the merge.
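As a minimal sketch of that chain, the function below lists the settings files an audit would need to read for one project, ordered from lowest to highest authority. The names `settings_layers` and `present_layers` are hypothetical, and the managed-settings location is passed in as a parameter because it depends on the operating system.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def settings_layers(
    project_dir: Path,
    home: Path,
    managed_path: Optional[Path] = None,
) -> List[Tuple[str, Path]]:
    """Return (scope, path) pairs from lowest to highest authority.

    Command-line arguments sit between "local" and "managed" in the chain
    but are not files, so they are not listed here.
    """
    layers = [
        ("user", home / ".claude" / "settings.json"),
        ("project", project_dir / ".claude" / "settings.json"),
        ("local", project_dir / ".claude" / "settings.local.json"),
    ]
    if managed_path is not None:
        # Assumption: the caller supplies the OS-specific managed file path.
        layers.append(("managed", managed_path))
    return layers


def present_layers(layers: List[Tuple[str, Path]]) -> List[str]:
    """Scopes whose settings file actually exists on this endpoint."""
    return [scope for scope, path in layers if path.is_file()]
```

Calling `present_layers(settings_layers(repo, Path.home()))` on an endpoint gives a quick view of which layers contribute to the merge there.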
The practitioner consequence: reading .claude/settings.json out of a Git repo tells you what the team intended, not what runs. The local and user files, both untracked, are where drift accumulates, and a package scanner reading checked-in config misses the policy that actually governs tool calls on the endpoint.
settings.json allow/deny drift
The permission system is a tiered allow / ask / deny model, evaluated deny-first: deny rules are checked before ask, then allow, and rule specificity does not change that order. A deny at any scope blocks the call regardless of an allow elsewhere. So the audit question is never just "what is allowed" but "what is the merged deny set, and did a local file quietly widen the allow list past it?" The patterns we see most often:
- Bare `Bash` on the allow list. A bare tool name allows every shell command. We see it added to `.claude/settings.local.json` to stop the prompts, turning the most powerful tool into a no-prompt path. Argument-constraining patterns like a `Bash(curl ...)` rule scoped to one domain are fragile: options before the URL, a different protocol, a redirect, or a shell variable all slip past.
- Process-wrapper and runner gaps. Claude Code strips wrappers like `timeout` and `nice` before matching, but environment runners such as `npx`, `docker exec`, and `devbox run` are not stripped, so `Bash(devbox run *)` matches whatever follows `run`, including a destructive command.
- Read/Edit deny rules that look protective but are not absolute. `Read(.env)`-style deny rules cover Claude's file tools and recognized Bash commands like `cat` and `sed`, but not a Python or Node script that opens the file itself. They are defense-in-depth, not a boundary.
- `defaultMode` set to `bypassPermissions` or `acceptEdits`. A user file flipping the default mode changes the safety story for every session on that machine.
None of this is exotic. It is the ordinary entropy of developers removing friction one prompt at a time. The point of the audit is to make the merged result legible: which endpoints widened past the team baseline, and on which tool.
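The drift patterns above can be sketched as a small check over already-parsed settings files. This is an illustration, not Claude Code's own merge logic: it assumes permission arrays are unioned across layers and that `defaultMode` is taken from the highest-precedence layer that sets it. The function names and the team baseline list are hypothetical.

```python
from typing import List

# Environment runners named in the text; a wildcard after one of them
# matches whatever command follows.
RUNNER_PREFIXES = ("npx", "docker exec", "devbox run")
RISKY_MODES = {"bypassPermissions", "acceptEdits"}


def merge_layers(layers: List[dict]) -> dict:
    """Merge settings dicts given lowest-authority first (assumed semantics)."""
    merged = {"allow": [], "ask": [], "deny": [], "defaultMode": None}
    for layer in layers:
        perms = layer.get("permissions", {})
        for key in ("allow", "ask", "deny"):
            for rule in perms.get(key, []):
                if rule not in merged[key]:
                    merged[key].append(rule)
        if "defaultMode" in perms:
            merged["defaultMode"] = perms["defaultMode"]  # later layer wins
    return merged


def classify_allow(rule: str) -> str:
    """Label one allow rule that is not part of the team baseline."""
    if rule == "Bash":
        return "bare Bash allow (every shell command, no prompt)"
    if rule.startswith("Bash(") and rule.endswith("*)"):
        inner = rule[len("Bash("):-1]
        if any(inner.startswith(runner + " ") for runner in RUNNER_PREFIXES):
            return f"runner wildcard {rule} (matches whatever follows the runner)"
    return f"allow beyond baseline: {rule}"


def drift_findings(merged: dict, baseline_allow: List[str]) -> List[str]:
    """List where an endpoint's merged policy widened past the baseline."""
    findings = [
        classify_allow(rule)
        for rule in merged["allow"]
        if rule not in baseline_allow
    ]
    if merged["defaultMode"] in RISKY_MODES:
        findings.append(f"defaultMode is {merged['defaultMode']}")
    return findings
```

Running `drift_findings` per endpoint against the checked-in project allow list yields the "who widened, and on which tool" view the audit is after.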
MCP servers from registries
MCP servers are where the inventory gets genuinely external. A server can be added at local, project, or user scope; project servers ship in .mcp.json and travel with the repo, user-scope servers follow the developer across every project. Many are pulled from public registries and marketplaces, and a stdio server is just a command Claude Code launches on the endpoint, a code-execution boundary we documented in Anthropic's MCP stdio RCE by design, with the broader hardening picture in the MCP server security complete guide. What we look for in a fleet MCP inventory:
- Where each server came from (registry, marketplace, or hand-written), and whether the same logical server resolves to different commands across endpoints.
- Scope sprawl: user-scope servers that load in every repo the developer opens, versus project-scope servers the team reviewed.
- Trust state: the per-project trust flags in `~/.claude.json` that decide whether a server runs without re-prompting.
- Tool exposure: MCP permissions use the `mcp__<server>__<tool>` shape, and a deny like `mcp__*` removes every MCP tool from context. We record what is actually reachable.
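One of these checks, the same server name resolving to different launch commands, can be sketched as follows. The input is assumed to be a mapping from endpoint name to a parsed config with an `mcpServers` object of stdio entries (`command` plus optional `args`). The function names are hypothetical, and collection from each laptop is out of scope.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Signature = Tuple[str, ...]


def launch_signature(server: dict) -> Signature:
    """Command plus arguments a stdio server would be launched with."""
    return (server.get("command", ""), *server.get("args", []))


def divergent_servers(fleet: Dict[str, dict]) -> Dict[str, Dict[Signature, List[str]]]:
    """Map server name -> {launch signature -> endpoints}.

    Only names with more than one distinct signature across the fleet
    are returned.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for endpoint, config in sorted(fleet.items()):
        for name, server in config.get("mcpServers", {}).items():
            seen[name][launch_signature(server)].append(endpoint)
    return {name: dict(sigs) for name, sigs in seen.items() if len(sigs) > 1}
```

A name that appears in the result is worth a manual look: either a legitimate local override or a server that was swapped for something else on one machine.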
A multi-agent MCP setup also widens the blast radius for prompt injection, which we walked through in comment-and-control multi-agent prompt injection and credential theft. The inventory does not stop that attack, but you cannot reason about it until you know which servers are wired in where.
Plugins, skills, and hooks
Plugins bundle skills, agents, hooks, and MCP servers behind one install, and enabledPlugins toggles them per scope. Skills load from ~/.claude/skills/, the project .claude/skills/, and any --add-dir directory, with live reload, so a skill can appear mid-session without a restart. Hooks matter most to security teams: a PreToolUse hook runs before the permission prompt and can deny a tool call, force a prompt, or let it proceed, and a hook that exits with code 2 blocks the call before the allow rules are consulted. Other non-zero exit codes are reported as errors but do not block the call.
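For a sense of what such a hook looks like, here is a minimal sketch of a PreToolUse decision function. It assumes the hook receives a JSON event on stdin with `tool_name` and `tool_input` fields and that exit code 2 signals a blocking decision. The deny patterns are hypothetical examples, not a recommended policy.

```python
import json
import re
import sys
from typing import TextIO

# Hypothetical deny patterns for illustration; a real list is a policy decision.
BLOCKED = [
    re.compile(r"\brm\s+-rf\s+/(\s|$)"),        # recursive delete of the root
    re.compile(r"\bcurl\b.*\|\s*(ba)?sh\b"),    # piping a download into a shell
]

EXIT_ALLOW = 0  # let the normal permission flow continue
EXIT_BLOCK = 2  # assumed blocking exit code for hooks


def decide(event: dict) -> int:
    """Return the exit code the hook should finish with for this event."""
    if event.get("tool_name") != "Bash":
        return EXIT_ALLOW
    command = event.get("tool_input", {}).get("command", "")
    for pattern in BLOCKED:
        if pattern.search(command):
            # The reason goes to stderr so it can be surfaced to the agent.
            print(f"blocked by policy: {pattern.pattern}", file=sys.stderr)
            return EXIT_BLOCK
    return EXIT_ALLOW


def main(stream: TextIO = sys.stdin) -> int:
    """Read one hook event as JSON and decide on it."""
    return decide(json.load(stream))
```

A script installed as a hook would end with `sys.exit(main())`; it is left out here so the functions can be imported and tested without reading stdin.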
Hooks cut both ways. A well-written PreToolUse hook is a legitimate enforcement point; the docs themselves recommend allowing Bash broadly and using a hook to reject specific commands. But a hook is an arbitrary shell command that fires on tool calls, and it can arrive through a plugin from a marketplace. The same mechanism that hardens one endpoint is, on another, an unreviewed script executing on every action. An audit has to enumerate hooks, where they came from (managed, user, project, or plugin), and what they run. This is the same project-file trust problem behind CVE-2025-59536, the Claude Code project-file RCE and token exfiltration.
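Enumerating hooks can be sketched as a flattening pass over each settings source. The sketch assumes the `hooks` key maps an event name to a list of matcher groups, each holding a list of `command` entries. `HookEntry` and `enumerate_hooks` are hypothetical names, and the source label is supplied by whoever collected the file.

```python
from typing import List, NamedTuple


class HookEntry(NamedTuple):
    source: str   # "managed", "user", "project", or a plugin identifier
    event: str    # e.g. "PreToolUse"
    matcher: str  # tool-name pattern the hook fires on ("" if unset)
    command: str  # shell command that runs


def enumerate_hooks(source: str, settings: dict) -> List[HookEntry]:
    """Flatten one settings dict into auditable hook rows."""
    rows = []
    for event, groups in settings.get("hooks", {}).items():
        for group in groups:
            for hook in group.get("hooks", []):
                if hook.get("type") == "command":
                    rows.append(HookEntry(
                        source=source,
                        event=event,
                        matcher=group.get("matcher", ""),
                        command=hook.get("command", ""),
                    ))
    return rows
```

Concatenating the rows from every source on an endpoint gives a flat table of hooks that can be diffed across the fleet or reviewed by hand.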
| Artifact | Loads from | What the audit answers |
|---|---|---|
| settings.json rules | user / project / local / managed merge | Effective merged allow/ask/deny per endpoint |
| MCP servers | .mcp.json, ~/.claude.json (local/user) | Source, scope, trust state, reachable tools |
| Plugins | enabledPlugins per scope, marketplaces | What each bundle silently enables |
| Skills | ~/.claude/skills, .claude/skills, --add-dir | Auto-invocable instructions, including live-reloaded ones |
| Hooks | managed / user / project / plugin | What shell runs before each tool call |
CLAUDE.md and auto memory
CLAUDE.md is the per-project instruction file loaded into context every session, and auto memory persists learnings across sessions, with the first 200 lines or 25KB of MEMORY.md loading at session start. Neither changes what Claude Code allows: the docs are clear that permission rules are enforced by the harness, not the model, so a prompt or CLAUDE.md cannot widen access. But both shape what the agent tries to do, and both are free-text files an attacker or careless teammate can edit.
In an audit, CLAUDE.md and MEMORY.md are the behavioral context layer: they will not bypass a deny rule, but they can steer an agent toward a tool call that an over-broad allow rule then waves through. We treat them as inventory items worth diffing, not as controls. The CI variant of this trust gap, where context determines who the agent acts as, is what we covered in the Claude Code GitHub Action bot-actor bypass.
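Diffing these files against a known-good copy needs nothing beyond the standard library. The sketch below is one possible shape: it assumes the audit already holds a reviewed baseline text (for example, the version on the main branch) and the text observed on an endpoint. The function names are hypothetical.

```python
import difflib
import hashlib
from typing import List


def fingerprint(text: str) -> str:
    """Stable content hash, useful for grouping identical copies fleet-wide."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def context_drift(baseline: str, observed: str, name: str = "CLAUDE.md") -> List[str]:
    """Unified diff lines between the reviewed file and the endpoint copy.

    Returns an empty list when the two copies are identical.
    """
    if fingerprint(baseline) == fingerprint(observed):
        return []
    return list(difflib.unified_diff(
        baseline.splitlines(),
        observed.splitlines(),
        fromfile=f"baseline/{name}",
        tofile=f"endpoint/{name}",
        lineterm="",
    ))
```

Added lines that push the agent toward a specific tool or command are the ones worth reading closely, since those are what an over-broad allow rule would then wave through.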
Why no existing control parses this
Walk the stack. EDR sees a signed node or claude process making expected syscalls: trusted software doing trusted-software things. A network proxy or DLP tool sees TLS to an API endpoint and, for stdio MCP, no network at all, because the server is a local subprocess. Package scanners read checked-in manifests and miss the untracked local and user settings where drift lives. GRC tooling captures policy intent, not merged per-endpoint reality. Each control is correct within its own frame and blind to the same agent running different policy, servers, and hooks on each laptop. The artifacts are plain files in ~/.claude and .claude/, but reading them is not enough; you have to reconcile the precedence merge to know what actually governs a tool call.
This is the layer Anomity makes visible. On every managed endpoint we inventory and classify eight AI artifact types (AI agents, MCP servers, extensions, skills, plugins, secrets, hooks, and CLIs), so the fleet inventory shows the merged settings, registry-sourced MCP servers, enabled plugins, and hooks as they actually resolve per machine. On agents that expose a hook like Claude Code's PreToolUse, Anomity returns allow, deny, or log on each tool call before it runs, and keeps a queryable 90-day audit trail routed to your SIEM, Slack, email, or Jira. It complements EDR, DLP, and GRC rather than replacing them. To see what your own fleet's Claude Code configuration looks like once it is reconciled, request early access.