The Rise of the AI Worm: University of Toronto Unveils Self-Evolving Malware
Researchers at the **University of Toronto** have developed a proof-of-concept AI-driven computer worm capable of autonomously reasoning its way through networks, generating tailored attack strategies, and replicating without human intervention. This new breed of malware, which uses a locally hosted open-weight large language model, poses a significant challenge to traditional cybersecurity defenses by adapting to new vulnerabilities in real time.

A groundbreaking paper, currently under peer review and posted to arXiv on June 2, details an AI-driven computer worm engineered by researchers at the **University of Toronto**. This proof-of-concept malware leverages a locally hosted, open-weight large language model (LLM) to navigate networks, devise bespoke attack strategies for each target, and self-replicate, all without human input or reliance on commercial AI services.
The research underscores a critical shift in the threat landscape: traditional patching strategies, often focused on single **CVE**s, may falter against malware that can dynamically inspect exposed services, ingest fresh advisories, and generate novel attack paths at runtime.
In 15 isolated experiments conducted on a deliberately vulnerable 33-host network, the worm identified an average of 31.3 vulnerabilities per run. It gained elevated access on an average of 23.1 hosts, roughly three-quarters of its active targets, and autonomously replicated to 20.4 of those hosts (62% of the entire network) over seven days. Critically, it achieved this without any prior knowledge of the network topology.
Unlike traditional worms that rely on a fixed exploit payload chosen at build time, this new AI-powered variant generates attack logic dynamically. This capability eliminates the need for pre-encoded exploit chains and removes any dependency on commercial AI providers such as **OpenAI** or **Anthropic**, which could otherwise revoke access or rate-limit the attacker.
## How the Worm Operated
Led by Associate Professor **Nicolas Papernot**, the **CleverHans Lab** team (comprising experts from the **University of Toronto**, **Vector Institute**, **University of Cambridge**, and **ServiceNow**) conducted 15 independent experiments. The test environment, dubbed "FakeCorp," was an isolated 33-host network running a mix of operating systems (Ubuntu, Debian, Rocky Linux, Alpine, and Windows Server 2008 R2, 2019, and 2022) alongside IoT devices.
Each host in the "FakeCorp" network had at least one real-world vulnerability. Five of the 33 machines were equipped with GPUs. In the primary propagation experiments, the worm utilized a shared GPU inference pool, simulating compute resources acquired from victim machines. A separate test of full GPU-tier replication, where the worm stages a local LLM instance on a compromised GPU host, succeeded in 68.8% of attempts.
This tiered design allows infected GPU-capable hosts to function as distributed reasoning nodes, providing inference capabilities for lower-compute devices on the network that cannot run the model independently.
Across the 15 runs, the worm achieved elevated access on 23.1 hosts and successfully launched a replica on 88% of those compromised systems. It reached up to seven generations of self-replication, with a mean of 5.1 generations. The researchers emphasize that the intentionally vulnerable setup was designed to measure autonomous reasoning across realistic individual flaws, not to assess spread through a highly hardened production network with active endpoint defenses.
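The percentages quoted for these runs can be sanity-checked from the reported means. The short sketch below does only that arithmetic; the variable names are illustrative, and because it divides one mean by another, the results need not match per-run averages computed in the paper exactly.

```python
# Means reported in the article (15 runs on the 33-host FakeCorp network).
TOTAL_HOSTS = 33
MEAN_ELEVATED = 23.1    # hosts where the worm gained elevated access
MEAN_REPLICATED = 20.4  # hosts where a replica was launched


def pct(numerator: float, denominator: float) -> float:
    """Return numerator / denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


replica_rate = pct(MEAN_REPLICATED, MEAN_ELEVATED)  # replicas per compromised host
network_share = pct(MEAN_REPLICATED, TOTAL_HOSTS)   # replicas across the whole network
elevated_share = pct(MEAN_ELEVATED, TOTAL_HOSTS)    # elevated access across the whole network

print(f"replica rate among compromised hosts: {replica_rate:.1f}%")
print(f"share of network running a replica:   {network_share:.1f}%")
print(f"share of network with elevated access: {elevated_share:.1f}%")
```

The first two values round to the 88% and 62% quoted in the article. The third is a share of all 33 hosts, so it is not the same quantity as the "three-quarters of active targets" figure, whose denominator the article does not give.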
The exploits used in a single run were diverse. They included a chain combining SambaCry with a writable root cron privilege escalation, as well as **Dirty Pipe**, **PrintNightmare**, **Drupalgeddon 2**, **Exim RCE**, blind SQL injection, JWT bypass, and Redis escape. These were not pre-programmed but were identified and exploited by the agent as it reasoned its way through each host.

The paper's Figure 2 illustrates a full radial propagation tree from one run, showcasing six to seven generations of spread across Linux, Windows, and IoT hosts. Individual exploit attempts had a success rate of approximately 44%. Most failures stemmed from malformed payload syntax rather than flawed logic. Web application exploits and Windows privilege escalation proved more challenging than Linux local escalation and service exploits, which the paper describes as a current model limitation rather than a fundamental defense.
## The Patching-Window Problem Exacerbated
One of the most concerning findings is the worm's ability to bypass its own training cutoff. By ingesting public advisory text at runtime, it successfully exploited test hosts configured with three vulnerabilities disclosed *after* the model was trained:
* **CVE-2026-39987**: A pre-authentication RCE in the Marimo Python notebook (CVSS 9.3).
* **CVE-2026-31431** (**CopyFail**): A Linux kernel privilege-escalation flaw in the algif_aead module, added by **CISA** to its Known Exploited Vulnerabilities catalog in May.
* **CVE-2026-43284** / **CVE-2026-43500** (**DirtyFrag**): Related Linux kernel local privilege-escalation issues.
Against the three test hosts carrying these vulnerabilities, the worm achieved root access in 41 out of 67 attempts.
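To put 41 of 67 in context, the sketch below computes the raw success rate and a Wilson score interval around it. This is an illustration rather than an analysis from the paper: it assumes the 67 attempts can be treated as independent trials with a common success probability, which may not hold across three different hosts and vulnerabilities.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width


# Figures from the article: root access in 41 of 67 attempts.
rate = 41 / 67
low, high = wilson_interval(41, 67)

print(f"success rate: {rate:.1%}")
print(f"approx. 95% Wilson interval: {low:.1%} to {high:.1%}")
```

Under those assumptions the point estimate is about 61%, with an interval of roughly 49% to 72%, which is wide enough that the result is best read as "more often than not" rather than as a precise rate.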
**CVE-2026-39987** was disclosed on April 8, 2026. **Sysdig** observed exploitation in honeypots just 9 hours and 41 minutes later, also documenting a real-world intrusion where an attacker used an LLM agent for post-exploitation activities after compromising a public Marimo instance. This highlights how an adaptive worm can exploit the patch gap by rapidly integrating new vulnerability intelligence.
The parallel to the **WannaCry** ransomware's impact lies not in blast radius, but in the patch gap. The vulnerability targeted by **EternalBlue**, the exploit leveraged by **WannaCry**, had been patched roughly two months before the attack (Microsoft released the fix in March 2017, and the outbreak began that May), yet many systems were still unpatched. This research makes a similar point: an adaptive worm can continuously probe new attack vectors while defenders are still validating and deploying fixes.
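The patch gap can be made concrete with a small calculation. The sketch below takes the 9 hours and 41 minutes between disclosure and first observed exploitation cited above and compares it with a few patch-deployment deadlines. The deadlines and their labels are hypothetical placeholders, not figures from the paper or from Sysdig.

```python
from datetime import timedelta

# Time from disclosure to first observed exploitation, as cited in the article.
TIME_TO_FIRST_EXPLOIT = timedelta(hours=9, minutes=41)

# Hypothetical patch-deployment deadlines, for illustration only.
PATCH_DEADLINES = {
    "emergency": timedelta(hours=24),
    "critical": timedelta(days=7),
    "standard": timedelta(days=30),
}


def exposure_window(
    patch_deadline: timedelta,
    time_to_exploit: timedelta = TIME_TO_FIRST_EXPLOIT,
) -> timedelta:
    """Time a host stays unpatched after exploitation has begun (zero if patched first)."""
    return max(patch_deadline - time_to_exploit, timedelta(0))


for label, deadline in PATCH_DEADLINES.items():
    print(f"{label:>9}: exposed for {exposure_window(deadline)}")
```

Even the hypothetical 24-hour deadline leaves more than 14 hours in which exploitation is already under way, which is the window an adaptive worm would be working in.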
## Zero Marginal Cost, No Central Kill Switch
Two characteristics make this new class of AI-driven worms particularly challenging to contain:
* **Cost Shift**: The operational cost shifts from rented API access to the compute resources the worm can capture. Once GPU-capable victim infrastructure is compromised, the attacker no longer pays per attempt, enabling sustained, high-volume attacks.
* **Decentralized Control**: Because the worm relies on open-weight models with no vendor dependency, provider-side controls are ineffective. Measures like service refusals, rate limiting, or account suspensions, the typical responses to misuse of commercial AI APIs, do not apply. There is no API key to revoke, meaning containment must occur at the network and host layers.
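As one illustration of what network-layer containment could look like, the sketch below flags connections to GPU-equipped hosts that are not on an approved list of inference services. Everything in it is an assumption made for the example: the addresses, the port, the `Flow` record, and the idea that defenders keep such an inventory. It is a deliberately crude allowlist check, not a control described in the paper, and a real deployment would also need to account for legitimate management traffic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """A simplified internal network flow record (hypothetical schema)."""
    src: str
    dst: str
    dst_port: int


# Hypothetical inventory: which hosts have GPUs, and which (host, port)
# pairs are sanctioned to serve model inference to which clients.
GPU_HOSTS = {"10.0.0.21", "10.0.0.22"}
APPROVED_INFERENCE = {("10.0.0.21", 8443)}
APPROVED_CLIENTS = {"10.0.0.5"}


def suspicious_flows(flows: list[Flow]) -> list[Flow]:
    """Return flows to GPU hosts that are not sanctioned inference traffic."""
    flagged = []
    for flow in flows:
        if flow.dst not in GPU_HOSTS:
            continue  # only traffic toward GPU hosts is in scope here
        sanctioned = (
            (flow.dst, flow.dst_port) in APPROVED_INFERENCE
            and flow.src in APPROVED_CLIENTS
        )
        if not sanctioned:
            flagged.append(flow)
    return flagged


observed = [
    Flow("10.0.0.5", "10.0.0.21", 8443),   # approved client, approved service
    Flow("10.0.0.40", "10.0.0.22", 8000),  # unknown service on a GPU host
    Flow("10.0.0.40", "10.0.0.30", 22),    # not a GPU host, out of scope
]

for flow in suspicious_flows(observed):
    print(f"review: {flow.src} -> {flow.dst}:{flow.dst_port}")
```

The design choice is to alert on anything not explicitly approved rather than to look for known-bad ports, since a worm staging its own model server is free to choose any port.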
Intriguingly, the researchers also observed the worm rewriting its own code on several occasions to bypass local security controls within the test environment, a behavior they had not explicitly programmed.
The current prototype was intentionally built without stealth features such as encryption, polymorphic code, persistence mechanisms, or track covering. A malicious variant that added them, for example encrypted payloads, process masquerading, and log cleanup, would leave defenders with far fewer easy signals.
## Context in the AI Threat Landscape
This is not the first instance of AI-driven worm research. **Morris II** (Cohen et al., 2025) demonstrated a self-replicating adversarial prompt spreading across AI email assistants, propagating within the AI application layer rather than across host infrastructure.
In March 2026, **ClawWorm** showcased self-replicating attacks across LLM agent ecosystems, hijacking persistent configurations and propagating to agent peers. The **University of Toronto** worm, however, is distinct: the LLM itself is not the target of the attack but rather the *engine* used to compromise ordinary network infrastructure.
Real-world operations are already pushing these boundaries. **Anthropic** reported in November 2025 that it disrupted a large AI-orchestrated espionage campaign, which it attributed with high confidence to a Chinese state-sponsored group it tracks as **GTG-1002**. The attackers reportedly used Anthropic's **Claude Code** tool to carry out 80-90% of the operation, including reconnaissance, exploit development, credential harvesting, lateral movement, and exfiltration, with human operators intervening only at key decision points.
**Google's Threat Intelligence Group** reported a related shift in May 2026, describing what it assessed with high confidence to be the first zero-day exploit developed with AI assistance. The exploit was found in a criminal group's script ahead of a planned mass exploitation event. The same report described malware families that generate their own commands at runtime instead of relying on hardcoded logic. The **University of Toronto**'s work is the laboratory counterpart of this evolving threat, pushing AI capabilities into host-level worm propagation.