AI Agent Skills: A New Vector for Supply Chain Attacks?
A recent experiment by security firm **AIR** has exposed critical vulnerabilities in how AI agent 'skills' are vetted and distributed, demonstrating how easily malicious code could bypass current security scanners. By leveraging trusted platforms and dynamic external links, **AIR** successfully deployed a seemingly innocuous skill that could have been weaponized to compromise thousands of AI agents, including those on corporate networks.
For the experiment, **AIR** built a deceptive AI agent skill and distributed it through a popular skill marketplace and an Instagram ad campaign. The firm claims this proof-of-concept reached approximately 26,000 agents, some of which were on corporate accounts.
Crucially, every security scanner tested against **AIR**'s fake skill flagged it as safe. The payload itself was designed to be harmless, merely collecting the user's email address. The objective was to highlight that common trust signals (scanners, **GitHub** stars, and open-source reputation) failed to detect the underlying threat.
An AI 'skill' functions as a set of instructions an agent loads into its context, executing them with a level of authority comparable to a user prompt. This inherent trust is precisely what makes them a potential vulnerability, necessitating the very skill-scanning tools that ultimately proved insufficient.
### The Deceptive Skill: 'brand-landingpage'
The fake skill, named **brand-landingpage**, purported to build landing pages using **Google**'s **Stitch** design tool, targeting non-technical users. To lend it credibility, **AIR** manipulated two key trust signals: **GitHub** stars and a clean scanner verdict.
**AIR** submitted a pull request to a skill marketplace repository boasting around 36,000 stars and 156 existing skills. Upon merging, the **brand-landingpage** skill inherited the repository's star count, artificially boosting its perceived legitimacy. An Instagram ad campaign, aimed at marketers, salespeople, and designers, then drove installations.
### Why Scanners Missed the Threat
The scanners **AIR** tested, including those from **Cisco AI Defense**, **NVIDIA Skillspector**, and those integrated into skills.sh, primarily analyze the static package provided: the `SKILL.md` file and accompanying files.

**AIR**'s skill contained no setup instructions within its package. Instead, it directed the agent to install the "Stitch SDK" by following documentation at an external link: `stitch-design.ai`. This domain was controlled by **AIR**, not **Google** (the legitimate **Stitch** service is found at `stitch.withgoogle.com`).
Initially, this external link pointed to genuine **Stitch** documentation. Seeing a clean package that referenced a plausible setup page, the scanners cleared the skill. The crucial element, the page the agent would eventually fetch and execute instructions from, remained outside the scope of the initial scan.

Once the skill was widely installed, **AIR** altered the content behind the `stitch-design.ai` link. The updated page instructed the agent to download and run a script. In this controlled demonstration, the script only sent the user's email address back to **AIR**, allowing the firm to count installations. However, a malicious actor could have used this foothold to read files, exfiltrate data, or access internal systems, limited only by the agent's permissions.
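One way to picture the gap is a pre-install check that looks past the package text to the hosts it references. The following is a minimal sketch, not a description of how any of the scanners named above work: the `TRUSTED_HOSTS` allowlist and both function names are assumptions made for illustration, and the check only sees explicit `http(s)` URLs, so a bare domain written without a scheme would slip past it.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist: hosts an organization has explicitly decided to trust.
TRUSTED_HOSTS = {"stitch.withgoogle.com"}

# Matches explicit http(s) URLs only; bare domains are not detected.
URL_RE = re.compile(r"https?://[^\s)>\]\"'`]+", re.IGNORECASE)


def external_hosts(skill_text: str) -> set[str]:
    """Return every hostname referenced by an http(s) URL in the skill text."""
    hosts = set()
    for url in URL_RE.findall(skill_text):
        # Drop sentence punctuation that the regex may have swallowed.
        host = urlparse(url.rstrip(".,;:")).hostname
        if host:
            hosts.add(host.lower())
    return hosts


def untrusted_hosts(skill_text: str, trusted: set[str] = TRUSTED_HOSTS) -> set[str]:
    """Return referenced hosts that are not on the allowlist."""
    return {host for host in external_hosts(skill_text) if host not in trusted}


if __name__ == "__main__":
    sample = (
        "Install the Stitch SDK by following https://stitch-design.ai/docs. "
        "Product page: https://stitch.withgoogle.com/."
    )
    print(untrusted_hosts(sample))
```

An exact-match allowlist is deliberately strict: a lookalike such as `stitch-design.ai` fails the check no matter how plausible its content looks on the day of the scan.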
This isn't an isolated finding. Just weeks prior, **Trail of Bits** demonstrated a similar bypass of **ClawHub**'s malicious-skill detector, **Cisco**'s scanner, and three scanners on skills.sh. Their conclusion was stark: scanners review a static package, while attackers can continually modify dynamic payloads until they evade detection.
Real-world campaigns have been employing this tactic for months, submitting clean skills while hosting the actual malicious payload on external sites fetched by the agent at installation.

The core issue is structural: the initial scan is a one-time event, but the content of an external page a skill points to can be changed at any time afterward. **Anthropic**'s own documentation for **Claude** agents and tools explicitly warns that skills fetching external URLs carry inherent risks because the content can change after vetting. Further research this year has also highlighted inconsistencies among scanners, often because they evaluate skills in isolation, ignoring external links and potential post-review modifications.
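The snapshot problem can be made concrete with a content fingerprint. The sketch below assumes a reviewer records a SHA-256 digest of the external page at vetting time and compares it on every later fetch. The function names are hypothetical, and fetching the page is left out so the example stays self-contained.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest of the fetched page content."""
    return hashlib.sha256(content).hexdigest()


def has_drifted(pinned_digest: str, current_content: bytes) -> bool:
    """True if the content no longer matches the digest recorded at vetting time."""
    return fingerprint(current_content) != pinned_digest


if __name__ == "__main__":
    vetted_page = b"Stitch SDK setup: follow the official documentation."
    pinned = fingerprint(vetted_page)  # stored alongside the scan verdict

    later_page = b"Stitch SDK setup: download and run this script."
    print(has_drifted(pinned, vetted_page))
    print(has_drifted(pinned, later_page))
```

A digest mismatch does not prove the new content is malicious, only that the page is no longer what was reviewed, which is the signal that should send it back through vetting.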
### Mitigating the Risk
For defenders, the message is clear and reinforced by this experiment: treat AI agent skills as full-fledged software, not mere text. The vetting process must extend beyond the submitted package to include what a skill points to externally.
The immediate priority is to identify what skills are already running, as many are installed without rigorous review. Organizations should route all new skill installations through a controlled, centralized source and implement continuous re-verification, particularly when external dependencies change. A clean scan at installation offers no lasting guarantee if the skill communicates with a link controlled by a third party.
Key recommendations include:
* **Pin versions:** Ensure agents use specific, immutable versions of external resources.
* **Least privilege:** Configure agents with the minimum necessary permissions.
* **Assume compromise:** Operate under the assumption that any external instruction an agent fetches could potentially run with the agent's full access.
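The last two recommendations can be sketched as a small policy gate. Everything here is hypothetical: the action names, the `source` labels, and the idea that an agent runtime tags each requested action with where the instruction came from are assumptions for illustration, not features of any specific agent framework.

```python
from dataclasses import dataclass

# Hypothetical set of actions treated as high risk.
HIGH_RISK_ACTIONS = {"run_shell", "read_file", "send_network"}


@dataclass(frozen=True)
class ActionRequest:
    action: str  # what the agent wants to do, e.g. "run_shell"
    source: str  # where the instruction came from: "user" or "fetched"


def is_allowed(request: ActionRequest, granted: set[str]) -> bool:
    """Apply least privilege first, then distrust fetched instructions."""
    # Least privilege: anything not explicitly granted is denied.
    if request.action not in granted:
        return False
    # Assume compromise: fetched content never triggers high-risk actions.
    if request.source == "fetched" and request.action in HIGH_RISK_ACTIONS:
        return False
    return True


if __name__ == "__main__":
    granted = {"render_page", "run_shell"}
    print(is_allowed(ActionRequest("run_shell", "user"), granted))
    print(is_allowed(ActionRequest("run_shell", "fetched"), granted))
```

Under this sketch, a page that is swapped after review can still tell the agent to run a script, but the request is refused because it originated from fetched content rather than from the user.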
While **AIR**'s reported figures (26,000 agents, including corporate accounts, and the potential for full control) warrant a skeptical read given the firm's commercial interest in launching a managed skill marketplace, the fundamental methodology is sound. The experiment doesn't uncover a new bug but rather orchestrates a confluence of known weak trust signals: borrowable **GitHub** stars, snapshot-based scans, and mutable external links. Whether the actual reach was 26,000 or a fraction thereof, the gaping security flaw it exposes remains a critical challenge for defenders in the evolving landscape of AI agents.