← Back to blog AdvisoryMedium

Ollama Model Pull API SSRF - CVE-2026-5530

Anomity Research Anomity Threat Research · Apr 4, 2026 · 5 min read

LLM Gateways & Proxies·Medium·CVE-2026-5530·Apr 4, 2026

Affected Ollama through 0.18.1 (server/download.go Model Pull API); no vendor response at disclosure

CVE-2026-5530 is a server-side request forgery in Ollama's Model Pull API, rated Medium. Insufficient validation of a user-supplied registry URL lets an attacker force the server to make arbitrary outbound HTTP requests. The flaw lives in the server/download.go component and affects Ollama through version 0.18.1. The vendor did not respond when contacted, so at disclosure there was no official fix. This advisory covers what the bug exposes, why a local model runtime is an agentic-endpoint problem, and how Anomity surfaces and governs the agents that drive it.

What happened

Ollama is a runtime that pulls and serves models locally behind a small HTTP API. Its Model Pull API fetches model layers from a registry, and the request includes the registry the server should contact. That registry value is where the trust boundary breaks.

In the affected versions, the pull path in server/download.go does not sufficiently validate the user-supplied registry URL. An attacker can submit a crafted registry value that forces Ollama to issue an outbound HTTP request to a destination of the attacker's choosing. Because the server makes that request, the attacker inherits the server's network position rather than their own.

That position is the impact. The forged request can reach internal services that are not exposed to the internet, cloud instance metadata endpoints that may return short-lived credentials, and hosts inside isolated network segments. The same primitive can carry data outward through an injected registry URL, so a model-pull request becomes a path for bypassing network controls and exfiltrating from segments the attacker could not otherwise touch.

Exploitation is not theoretical. GreyNoise telemetry shows a dedicated campaign abusing Ollama's model-pull functionality to exploit the SSRF with malicious registry URLs, with activity spiking over the 2025 to 2026 holiday period. With no vendor response and no patch, operators are left to constrain the server themselves: restrict pulls to trusted registries, and block out-of-band DNS at the resolver layer.

Detail	Value
Identifier	CVE-2026-5530
Type	Server-side request forgery (user-supplied registry URL)
Severity	Medium
Component	server/download.go (Model Pull API)
Affected	Ollama through 0.18.1
Fixed in	No vendor response at disclosure; no official fix
Exploited	GreyNoise: active campaign, spiking over the 2025-2026 holiday period

Why this is an agentic-endpoint risk

A local model runtime rarely sits alone. Ollama exists because AI agents, CLIs, and developer tooling want a model endpoint they can call without leaving the host. On a managed endpoint, the Ollama process is an AI artifact in its own right, and so are the Claude Code sessions, MCP servers, and command-line agents that point at it. When an agent can trigger a model pull, it can reach the very API this SSRF abuses.

That reachability is the risk. The leak is driven by the request itself: an agent or a user passing a crafted registry URL through the pull API, whether by mistake, through a poisoned tool definition, or as a deliberate attack, can steer the server to an internal service or a metadata endpoint. Network and EDR controls see the outbound connection, but cannot tell you which agents on which endpoints run an affected Ollama build, or whether any were allowed to initiate a pull to an unapproved registry.

This is the same artifact-layer blind spot we track across the gateway cluster, including the sibling case in LiteLLM api_base SSRF - CVE-2024-6587 and LiteLLM pre-auth SQL injection - CVE-2026-42208. The runtime is one node in a graph of AI artifacts, and you can't govern what you can't see. Fleet-wide inventory of every AI artifact is the precondition for scoping an SSRF like this one.

How Anomity surfaces and governs it

Anomity inventories eight AI artifact types on every managed endpoint: AI agents, MCP servers, extensions, skills, plugins, secrets, hooks, and CLIs. For CVE-2026-5530 that means the Ollama process and its version are catalogued alongside the agents and CLIs that drive it, so you can answer "which endpoints run an affected Ollama build, and what talks to it" from the fleet inventory instead of guessing, even where the vendor offers no patch to track against.

On agents that expose a hook, such as Claude Code PreToolUse, Anomity returns allow, deny, or log on each tool call before it runs. That is the enforcement point in runtime governance: a tool call that triggers a model pull to a registry outside an approved allowlist can be denied or logged in line rather than discovered after the forged request has already reached an internal service. Anomity collects metadata only and redacts secrets on the endpoint, so any credentials the SSRF tried to reach never pass through Anomity.

Every decision is written to a queryable 90-day audit trail. After a disclosure like this, that trail is what lets responders scope the event: which agents initiated a pull, when, and what registry each call named. Anomity routes those decisions to SIEM, Slack, email, or Jira so the right team sees them in the tool they already use. The result is the timeline and the enforcement record described under outcomes.

Anomity complements your existing Network, EDR, DLP, and GRC controls rather than replacing them. It adds the agentic-endpoint layer those tools cannot see. See how it works and how Anomity compares for where it fits.

What to check across your fleet

Identify every endpoint and service running Ollama and record the exact version; treat anything at or below 0.18.1 as affected.
Restrict model pulls to a short allowlist of trusted registries so the Model Pull API cannot be steered to an arbitrary host.
Block out-of-band DNS resolution at the resolver layer to cut the channel attackers use to confirm and exfiltrate through the SSRF.
Restrict network reachability so Ollama is not exposed to untrusted callers, and require authentication in front of the runtime's API.
Deny the runtime egress to internal services, cloud instance metadata endpoints, and isolated segments it has no business reaching.
Review outbound connection and DNS logs for requests to metadata endpoints or unexpected internal hosts originating from the Ollama process.
Enumerate which AI agents, CLIs, and MCP servers can trigger a model pull through the affected runtime, using a fleet-wide AI artifact inventory.
Confirm hook-based allow/deny/log enforcement is active on agents that drive Ollama, so a pull to an unapproved registry can be blocked before it runs.

CVE-2026-5530 turns one model-pull request into a foothold for reaching internal services and metadata endpoints, with no vendor fix to wait for, which is exactly why the AI artifact layer needs its own inventory and enforcement. For the full cluster context, see the pillar on securing LLM gateways and proxies. To see Anomity inventory your agents, govern tool calls at the hook, and keep a 90-day audit trail, request early access.

Frequently asked questions

Is there an official fix for CVE-2026-5530 in Ollama?

At disclosure there was no vendor fix. The flaw affects Ollama through version 0.18.1, and the vendor did not respond when contacted, so operators were left without official mitigation guidance. That makes compensating controls the only practical defense for now. Restrict model pulls to a small set of trusted registries so the server cannot be steered to an arbitrary host, and block out-of-band DNS resolution at the resolver layer to cut the channel attackers use to confirm and exfiltrate through the SSRF. Treat any internet-reachable Ollama instance as exposed until you have constrained where its Model Pull API is allowed to connect.

What can an attacker actually reach through this SSRF?

Because the server makes the outbound request, the attacker inherits the server's network position. Insufficient validation of the user-supplied registry URL in server/download.go lets a crafted value force Ollama to send arbitrary outbound HTTP requests. That reaches internal services not exposed to the internet, cloud instance metadata endpoints that can return short-lived credentials, and hosts inside isolated network segments. The same primitive can carry data outward through an injected registry URL, turning a model-pull request into a path for bypassing network controls and exfiltrating from segments the attacker could not otherwise touch.

Is CVE-2026-5530 being exploited in the wild?

Yes. GreyNoise telemetry shows a dedicated campaign abusing Ollama's model-pull functionality to exploit the SSRF with malicious registry URLs, with activity spiking over the 2025 to 2026 holiday period. That pattern of opportunistic scanning against a widely deployed local-model runtime, combined with the absence of a vendor patch, means an exposed instance is a realistic target rather than a theoretical one. If you run Ollama anywhere reachable, assume the Model Pull API is being probed and prioritize the registry-allowlist and out-of-band DNS controls over waiting for an upstream release.

How does Anomity reduce exposure when a runtime like Ollama is at risk?

Anomity treats the local model runtime as an AI artifact on the endpoint, so it inventories the Ollama process, its version, and the agents and CLIs that drive it. On agents that expose a hook, such as Claude Code PreToolUse, Anomity returns allow, deny, or log on each tool call before it runs, so a call that triggers a model pull to an unapproved registry can be denied or logged in line. Every decision lands in a queryable 90-day audit trail, giving responders the timeline they need to scope an SSRF event across the fleet.