Critical Memory Leak and Unpatched RCE Flaws Plague Ollama

A critical vulnerability in **Ollama**, dubbed 'Bleeding Llama,' could allow unauthenticated attackers to leak sensitive process memory. Additionally, two unpatched remote code execution (RCE) flaws in Ollama's Windows update mechanism pose a significant risk, highlighting the need for immediate security measures.

2026-05-10T22:24:45

![Ollama Vulnerability](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj92eUjjTTMJPizvUJGwq7Ych7nrXHwGRNt3hS9yjNGRJk5d3pdIKjeZhQDVuFp0DnKjP4qoieGWFjswm7nHDLBaxWC3DxFIfLfRjMSEXd0Ta04vcTrbCpS9PEXebUUbMBxBt0VOb-PKVk-7Cq0FjuMXl4VtKneb5a3ujCo872goPN22GBFFhReJtWsQJLK/s16000/oll.jpg)

Cybersecurity researchers have uncovered a critical security vulnerability in **Ollama** that, if exploited, could allow a remote, unauthenticated attacker to leak the entire process memory. This out-of-bounds read flaw, potentially impacting over 300,000 servers globally, is tracked as **CVE-2026-7482** (CVSS score: 9.1) and has been codenamed **Bleeding Llama** by **Cyera**.

**Ollama** is a popular open-source framework that enables the local execution of large language models (LLMs). The project boasts over 171,000 stars and has been forked over 16,100 times on GitHub.

"Ollama before 0.17.1 contains a heap out-of-bounds read vulnerability in the GGUF model loader," according to the CVE description. "The /api/create endpoint accepts an attacker-supplied GGUF file in which the declared tensor offset and size exceed the file's actual length; during quantization in fs/ggml/gguf.go and server/quantization.go (WriteTo()), the server reads past the allocated heap buffer."

**GGUF** (GPT-Generated Unified Format) is a file format for storing large language models for local loading and execution. The vulnerability stems from Ollama's use of the `unsafe` package when creating a model from a GGUF file, specifically in the `WriteTo()` function, bypassing memory safety guarantees.

### Attack Scenario

A malicious actor can send a crafted GGUF file to an exposed Ollama server, setting the tensor's shape to an extremely large number to trigger the out-of-bounds heap read during model creation via the `/api/create` endpoint. Successful exploitation can leak sensitive data from the Ollama process memory.
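
The CVE description points at a missing consistency check: a tensor's declared offset and size are trusted even when they exceed the file's actual length. As a minimal illustration of that kind of check, and not Ollama's actual fix, the Python sketch below validates a declared tensor against the length of the data it is supposed to live in. The `TensorInfo` type and `tensor_fits` function are hypothetical names, and the fixed bytes-per-element model is a simplification (real GGUF quantized types are block-based).

```python
from dataclasses import dataclass
from math import prod
from typing import Tuple


@dataclass(frozen=True)
class TensorInfo:
    """Hypothetical, simplified view of one tensor entry in a model file."""
    name: str
    shape: Tuple[int, ...]   # declared dimensions (attacker-controlled)
    bytes_per_element: int   # assumption: fixed-width elements, for simplicity
    offset: int              # declared offset into the tensor data region


def tensor_fits(t: TensorInfo, data_region_len: int) -> bool:
    """Return True only if the declared tensor lies fully inside the data region."""
    # Reject nonsensical metadata before doing any arithmetic with it.
    if t.offset < 0 or t.bytes_per_element <= 0 or any(d < 0 for d in t.shape):
        return False
    # Python integers do not overflow; in a fixed-width language this
    # multiplication and the addition below would each need overflow checks.
    size = prod(t.shape) * t.bytes_per_element
    return t.offset + size <= data_region_len
```

A loader following this pattern would refuse to process any tensor for which `tensor_fits` returns `False`, rather than reading whatever happens to sit beyond the buffer.
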
The leaked data may include environment variables, API keys, system prompts, and concurrent users' conversation data, which can then be exfiltrated by uploading the resulting model artifact through the `/api/push` endpoint to an attacker-controlled registry.

The exploitation chain involves these steps:

* Upload a crafted GGUF file with an inflated tensor shape to a network-accessible Ollama server using an HTTP POST request.
* Use the `/api/create` endpoint to start model creation, triggering the out-of-bounds read vulnerability.
* Use the `/api/push` endpoint to exfiltrate data from the heap memory to an external server.

"An attacker can learn basically anything about the organization from your AI inference -- API keys, proprietary code, customer contracts, and much more," said **Cyera** security researcher Dor Attias.

![api](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwC3ssbxShiYtGxS0JsLsXPZNi7Atqo7Kp7Le1nJRDTA8F69oR9CRuvm0jFe7LpKoj8_w1nZCRjfcXhcZVbfBwl98PNt_xUeAJvZWUlKm-3fxB6AgcvNLZ9C1qEyzvg9bXwbW7lTrFjlnfWkOmUEARlwwPhO231DqSRA2r4QrKud_BpmEk6IhO5ZvoT1FJ/s1600/api.png)

"On top of that, engineers often connect Ollama to tools like Claude Code. In those cases, the impact is even higher -- all tool outputs flow to the Ollama server, get saved in the heap, and potentially end up in an attacker's hands."

Users are advised to apply the latest fixes, limit network access, audit running instances for internet exposure, and isolate and secure them behind a firewall. Deploying an authentication proxy or API gateway in front of all Ollama instances is also recommended, as the REST API lacks built-in authentication.

### Two Unpatched Flaws in Ollama Lead to Persistent Code Execution

Researchers at **Striga** have detailed two vulnerabilities in Ollama's Windows update mechanism that can be chained into persistent code execution. The flaws remain unpatched following their disclosure on January 27, 2026, and a 90-day disclosure period.
According to Bartłomiej "Bartek" Dmitruk, co-founder of **Striga**, the Windows desktop client auto-starts on login from the Windows Startup folder, listens on 127.0.0[.]1:11434, and periodically polls for updates in the background via the `/api/update` endpoint to run any pending updates on the next app start.

The identified vulnerabilities are a path traversal and a missing signature check that, when combined with the on-login routine, can permit an attacker with the ability to influence update responses to execute arbitrary code at every login. The flaws are listed below:

* **CVE-2026-42248** (CVSS score: 7.7) - A missing signature verification vulnerability: the Windows updater does not verify the update binary prior to installation, unlike the macOS version.
* **CVE-2026-42249** (CVSS score: 7.7) - A path traversal vulnerability: the Windows updater creates the local path for the installer's staging directory directly from HTTP response headers without sanitizing it.

To exploit the flaws, the attacker needs to control an update server reachable by the victim's Ollama client. This could lead to a scenario where an arbitrary executable is supplied as part of the update process and gets written to the Windows Startup folder without raising any signature check issues.

One approach involves overriding `OLLAMA_UPDATE_URL` to point the client at a local server on plain HTTP. The attack chain also assumes `AutoUpdateEnabled` is on, which is the default setting.

The missing integrity check can lead to code execution on its own without exploiting the path traversal vulnerability. In this case, the installer is dropped into the expected staging directory. During the next launch from the Startup folder, the update process is invoked without re-verifying the signature, causing the attacker's code to be executed instead.
While this remote code execution is not persistent, as the next legitimate update overwrites the staged file, adding the path traversal allows a malicious actor to redirect the executable to be written outside the usual path, achieving persistent code execution.

According to **CERT Polska**, which took over the coordinated disclosure process, Ollama for Windows versions 0.12.10 through 0.17.5 are vulnerable to the two flaws. In the interim, users are advised to turn off automatic updates and remove any existing Ollama shortcut from the Startup folder (`%APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup`) to disable the silent on-login execution pathway.

"Any Ollama for Windows installation running version 0.12.10 through 0.22.0 is vulnerable," Dmitruk said. "The path traversal writes attacker-chosen executables into the Windows Startup folder. The missing signature verification keeps them there: the post-write cleanup that would remove unsigned files on a working updater is a no-op on Windows. On the next login, Windows runs whatever was left behind."

"The chain produces persistent, silent code execution at the privilege level of the user running Ollama. Realistic payloads include reverse shells, info-stealers exfiltrating browser secrets and SSH keys, or droppers that pivot to additional persistence mechanisms. Anything that runs as the current user. Removing the dropped binary from the Startup folder ends the persistence, but the underlying flaws remain."
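
To make the path traversal half of the chain concrete, the Python sketch below shows one way an updater could treat a server-supplied value as a single directory name instead of a path. It is an illustration under stated assumptions, not Ollama's code: `safe_staging_path`, `header_value`, and the `C:\staging` root are hypothetical, and the article does not say which response header the updater reads. It also does nothing about the separate missing signature check.

```python
from pathlib import PureWindowsPath


def safe_staging_path(staging_root: str, header_value: str) -> PureWindowsPath:
    """Join an untrusted name onto staging_root, rejecting anything that could escape it.

    header_value stands in for whatever server-controlled string the updater
    uses to name its staging directory (hypothetical parameter).
    """
    candidate = PureWindowsPath(header_value)
    # An anchor means a drive letter, a leading slash, or a UNC prefix.
    # More or fewer than one part means separators were present or it is empty.
    if candidate.anchor or len(candidate.parts) != 1:
        raise ValueError(f"not a plain name: {header_value!r}")
    name = candidate.parts[0]
    # A lone ".." would climb out of the root; ":" could name an alternate stream.
    if name == ".." or ":" in name:
        raise ValueError(f"unsafe name: {header_value!r}")
    return PureWindowsPath(staging_root) / name
```

With the hypothetical root `C:\staging`, a value such as `0.17.5` resolves to `C:\staging\0.17.5`, while `..\..\Startup\evil.exe` raises `ValueError` instead of steering the write toward the Startup folder. `PureWindowsPath` is used so the Windows path rules apply regardless of the platform running the check.
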
