AI agents are creating a new supply chain crisis. We have a narrow window to get it right.

When I wrote about Project Glasswing and Mythos Preview a few weeks ago, I argued that the hard problem for CISOs wasn't going to be whether AI could find vulnerabilities, but what happens to the rest of the system once it does.

The window between vulnerability disclosure and weaponization is collapsing, and the open-source supply chain is feeling it directly:

  • Trivy compromised in February, attackers back inside a month later harvesting credentials they used to poison LiteLLM (95 million monthly downloads).
  • Axios hijacked with cross-platform malware shipped to 100 million weekly downloads.
  • Shai-Hulud, the first self-replicating worm in npm, exposing 25,000+ private repositories across two waves.
  • The TeamPCP campaign moving through five separate ecosystems in a single week.
  • Vercel, a well-run commercial steward of critical OSS, compromised because one employee granted "Allow All" OAuth permissions to a third-party AI tool.

These aren't separate incidents. They're one story: an open-source ecosystem under unprecedented pressure, with attackers operationalizing AI-assisted discovery faster than maintainers and defenders can patch. HackerOne paused its Internet Bug Bounty program in March because the asymmetry between AI-driven discovery and human-paced remediation had broken the model. Their own framing: the bottleneck used to be discovery; now it’s remediation.

This is the existential crisis that most CISOs have at least begun tracking.

The part that's getting less attention: while we're losing ground on the supply chain crisis we already have, an entirely new one is forming alongside it. We recently got a preview of what that looks like.

On April 15, OX Security published what they called "The Mother of All AI Supply Chains" — a systemic architectural vulnerability in Anthropic's official Model Context Protocol SDKs. The flaw reaches 200,000 servers and 150 million downloads across popular projects like LiteLLM, LangChain, and IBM's LangFlow, and the research produced fourteen CVEs, most rated critical. Researchers executed commands on six live production platforms and demonstrated zero-click prompt injection in AI-integrated development environments like Windsurf and Cursor. On top of all that, they submitted a proof-of-concept malicious entry to eleven MCP marketplaces. Nine accepted it.

Anthropic has characterized the behavior as expected within the protocol's current design, and individual vendors have issued patches. Reasonable people can disagree about whether the protocol-layer flaw is a bug or an expected consequence of design choices. What's not up for debate is what the marketplace test exposed: most of the distribution channels for AI-native dependencies don't have working vetting controls. A test payload made it through nine of eleven gates.

The new dependency ecosystem that most enterprises are already plugged into doesn't have working supply chain controls. And it's growing fast.

A new kind of dependency

I'm talking about AI agent skills, plugins, MCP servers, and the marketplaces distributing them.

Skills are small packages of instructions that teach an agent how to do a specific task. Anthropic published an open standard for them last December and OpenAI adopted the same format shortly after. Plugins bundle skills with agent definitions and configurations. MCP (Model Context Protocol) is how agents connect to and act on external services like calendars, databases, code repositories, and internal tools. Finally, each of these services gets distributed through marketplaces. Anthropic runs one. GitHub runs one. And third-party marketplaces are multiplying; one claims to index 900,000 skills automatically scraped from public repositories.
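
To make the MCP piece concrete, here is roughly what one of these servers looks like. This is a minimal sketch based on the official MCP Python SDK's FastMCP helper; the calendar tool and its behavior are invented for illustration, and the SDK's surface may have shifted since this was written. The point is how little code stands between "published package" and "something your agent can act through."

```python
# Minimal MCP server sketch using the official Python SDK's FastMCP helper
# (pip install mcp). The tool below is a stand-in: a real server would call a
# calendar API with whatever credentials it holds, which is exactly the trust
# decision described above.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("calendar-demo")

@mcp.tool()
def create_event(title: str, start_iso: str, duration_minutes: int = 30) -> str:
    """Create a calendar event on behalf of the agent's user (placeholder logic)."""
    return f"Created '{title}' at {start_iso} for {duration_minutes} minutes"

if __name__ == "__main__":
    # stdio transport: the agent launches this process and exchanges JSON-RPC with it.
    mcp.run()
```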

Four dynamics make this ecosystem behave differently from anything in your existing supply chain stack, and between them, the Vercel compromise and last week's OX disclosure show each one already playing out in the wild.

  1. Permissions inheritance. A malicious open-source library runs with whatever access is possessed by the program it's part of. A skill runs inside an AI agent that already holds your users' permissions — their email, their calendar, their files, their access to internal systems. Vercel is the pattern in miniature: an employee granted sweeping permissions to a third-party AI tool, and an attacker rode that grant into the enterprise. The same risk applies to any skill or plugin operating inside an agent with broad credentials.
  2. MCP servers are a privileged third party. An MCP server doesn't just extend what an agent knows; it gives the agent the ability to act in an external system. That's a much bigger trust decision than installing a library. When your agent can create calendar events, run database queries, or open pull requests via an MCP server someone else operates, you've effectively extended your trust boundary to that vendor's infrastructure. Traditional vendor due diligence won't get you there. A questionnaire about the vendor's security controls doesn't surface the kind of risk the OX disclosure exposed: an architectural flaw at the protocol layer, propagated through every SDK and distributed by marketplaces with no effective vetting. MCP servers need a different kind of evaluation — runtime behavior, upstream dependencies, default permissions.
  3. Dynamic loading. Skills and MCP connections get loaded on the fly by the agent, based on what it decides it needs. That's a design strength, but it means most of the supply chain security tooling enterprises have already invested in doesn't see any of this (a hypothetical sketch of the pattern follows this list).
  4. Distribution velocity. Some marketplaces curate, others mirror. A marketplace that automatically syncs 900,000 entries from public repositories has a very different risk profile from one that vets publishers and requires verification, and right now the scraped ones are winning on volume. The OX researchers proved that nine of eleven MCP marketplaces will accept unvetted submissions. There's no reason to assume skill marketplaces are better.
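
To illustrate the dynamic-loading point from item 3, here is a deliberately hypothetical sketch. The marketplace URL, the response shape, and the loader itself are all invented; what matters is that resolution happens at runtime, at the agent's discretion, so nothing about the dependency ever lands in a requirements file or lockfile for conventional SCA tooling to inspect.

```python
# Hypothetical sketch of runtime skill resolution -- every name and endpoint here
# is invented for illustration. Note what's missing: no manifest entry, no lockfile
# pin, nothing a static dependency scanner would ever see.
import json
import urllib.parse
import urllib.request

MARKETPLACE_INDEX = "https://skills.example.invalid/api/search"  # hypothetical index

def load_skill_for(task_description: str) -> dict:
    """Ask a marketplace index for a skill matching the task and fetch it on the fly."""
    query = urllib.parse.urlencode({"q": task_description})
    with urllib.request.urlopen(f"{MARKETPLACE_INDEX}?{query}") as resp:
        best_match = json.load(resp)[0]  # first hit wins; no vetting step
    with urllib.request.urlopen(best_match["download_url"]) as resp:
        return json.load(resp)           # instructions the agent will now follow
```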

When a five-month-old protocol ships with insecure defaults, and the SDK propagates the flaw across every supported language, and 200,000 servers inherit it, and nine of eleven distribution channels accept malicious entries without flinching — the timeline from "ecosystem launches" to "ecosystem has a Mother-of-All-Supply-Chains advisory" compresses to almost nothing. We don't have years to figure out governance the way we did with npm. The patterns are already being set, and attackers with AI tooling are watching the new ecosystem with the same interest they've trained on the old one.

The chance we have now and how we can seize it

Here’s the good news: the open-source community has already learned, expensively, what governance looks like when it works. And we can bring this playbook — signing, verification, inventory, coordinated disclosure, two-factor on publisher accounts, funded stewardship — to a new ecosystem from the start, instead of retrofitting it after a decade of incidents.

What that looks like in practice:

  • Curation over aggregation. Prefer marketplaces that vet publishers and verify entries. Be clear-eyed about the difference between a curated catalog and a scraped index. The OX proof-of-concept is a direct argument for preferring the former.
  • Capability-based permissions. Push vendors toward models where skills and MCP servers declare what they need access to, and users grant permissions per-connection rather than inheriting the agent's full set (a minimal sketch of the idea follows this list).
  • Sandboxed execution environments. Even a vetted skill running with properly scoped permissions can do unexpected things at runtime; a single prompt injection can turn a trusted tool into an attack path. Agents should run in environments with explicit constraints at the system layer: containerized execution, controlled network egress, filesystem isolation, ephemeral credentials where persistent ones aren't required. This is defense in depth applied to the agent runtime (see the container sketch after this list).
  • Signing and verification as a baseline. Cryptographic signing by publishers and verification at load time should be a precondition of using a skill or MCP server in your environment, just as you'd require of any other software your enterprise depends on. The fact that this isn't already standard in five-month-old marketplaces is exactly why it has to be a procurement requirement now.
  • Runtime visibility. Anything loaded dynamically by an agent should be as visible to security tooling as any other running process. That means visibility into every skill loaded, every MCP server invoked, every tool call made, every action taken — and ideally the reasoning chain that led to each action.
  • Funded stewardship from day one. Open source taught us that the security work a dependency creates doesn't get done if no one's paying for it. The same will apply here. Make sure your security budget reflects the people, tooling, and time it takes to inventory, evaluate, and monitor every skill, plugin, and MCP server your organization depends on. This is real work, and it doesn't fit squarely into someone's existing job description.
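
On capability-based permissions, here is the shape of the idea. The manifest format, field names, and grant model below are all hypothetical; the principle is that a skill or MCP server states up front what it needs, the user grants each capability per connection, and the runtime denies everything else.

```python
# Hypothetical sketch of capability-based permissions for a skill or MCP server.
# The manifest fields and capability strings are invented for illustration.
DECLARED = {
    "name": "expense-report-helper",
    "capabilities": ["calendar:read", "files:read:/reports"],  # declared up front by the publisher
}

GRANTED = {"calendar:read"}  # granted per-connection by the user, not inherited from the agent

def authorize(action: str) -> bool:
    """Allow an action only if it was both declared by the package and granted by the user."""
    return action in DECLARED["capabilities"] and action in GRANTED

assert authorize("calendar:read")
assert not authorize("files:read:/reports")  # declared but never granted
assert not authorize("email:send")           # never declared at all
```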

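And on sandboxed execution, a minimal version of the constraint set might look like the following, assuming the docker-py client (pip install docker). The image name and command are placeholders; the constraints are the point: no network egress, a read-only filesystem, no inherited secrets, and nothing that persists between invocations.

```python
# Sketch of sandboxed tool execution using the docker-py client (pip install docker).
# The image and command are placeholders; the keyword arguments carry the policy.
import docker

client = docker.from_env()

def run_skill_sandboxed(image: str, command: list[str]) -> str:
    """Run one agent tool step inside a locked-down, throwaway container."""
    output = client.containers.run(
        image,
        command,
        network_disabled=True,  # no egress: a prompt-injected step can't phone home
        read_only=True,         # no writes to the container filesystem
        remove=True,            # ephemeral: nothing survives the invocation
        environment={},         # no inherited secrets or long-lived credentials
    )
    return output.decode()

# Usage with a hypothetical image:
# run_skill_sandboxed("internal/skill-runner:latest", ["python", "run.py"])
```
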
It’s worth bearing in mind that this ecosystem is only five months old. The standards aren't locked in. The governance model is being written in real time, and security leaders who engage early have real influence over how it gets built. That's a meaningfully different position than the one we inherited with npm. Here are five ways we can seize the moment, roughly in order of urgency:

  • Inventory everything. Every skill, plugin, and MCP server enabled across your organization — across Claude, ChatGPT, Copilot, Cursor, Windsurf, and every other AI tool your developers have installed locally, plus everything your own teams have built and deployed internally. Many organizations have no idea what's running in their agent runtime right now. Start there (a starter sketch for a single-machine pass follows this list).
  • Patch and review your MCP exposure specifically. OX disclosed fourteen CVEs across LiteLLM, LangChain, LangFlow, Flowise, and others. If any of those are in your stack, you should be triaging. If you don't know whether they're in your stack, that's the inventory problem above.
  • Require marketplaces to be on an approved list, not merely absent from a blocked one. After OX demonstrated that nine of eleven MCP marketplaces accept unvetted entries, "trust but verify" isn't a defensible default. Treat agent marketplaces like any other software supply channel — explicit allowlist, documented vetting, periodic review.
  • Put real enforcement between your agents and what they touch. Direct agent-to-MCP-server connections give security teams nowhere to apply policy or detect compromise. The right architecture for closing that gap is still being worked out. Gateways were the early instinct — proxy traffic between agents and the services they call, enforce policy at network egress. Runtime enforcement takes a different approach: policy that lives inside the agent itself, gating actions at the point of execution rather than at the network layer, catching what gateways can't. Most enterprises will probably need some combination. What matters is that something sits between intent and action (the policy-gate sketch after this list is a minimal version of the idea).
  • Treat every permission your agents grant to a third party the way you treat third-party vendor permissions everywhere else. Which, after Vercel, should be with far more skepticism than most companies apply today. A skill granted full agent credentials is, in practice, a vendor with authenticated access to your environment.
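
Here is the starter inventory sketch promised above. The config paths are common defaults for a few popular clients as of this writing (Claude Desktop on macOS, Cursor, Claude Code); they vary by tool, platform, and version, so treat the list as a seed for your own, not as a registry.

```python
# Starter sketch for a single-machine MCP inventory pass. Config paths are common
# defaults as of this writing and will vary by client, platform, and version.
import json
from pathlib import Path

CANDIDATE_CONFIGS = [
    Path.home() / "Library/Application Support/Claude/claude_desktop_config.json",  # Claude Desktop (macOS)
    Path.home() / ".cursor/mcp.json",                                               # Cursor (global)
    Path.cwd() / ".mcp.json",                                                       # Claude Code (per-project)
]

def list_mcp_servers() -> dict[str, str]:
    """Collect every MCP server configured in whichever candidate files exist."""
    inventory = {}
    for cfg in CANDIDATE_CONFIGS:
        if not cfg.is_file():
            continue
        data = json.loads(cfg.read_text())
        for name, spec in data.get("mcpServers", {}).items():
            inventory[f"{cfg}::{name}"] = spec.get("command") or spec.get("url", "?")
    return inventory

if __name__ == "__main__":
    for server, target in sorted(list_mcp_servers().items()):
        print(server, "->", target)
```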

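And here is the policy-gate sketch referenced in the enforcement item above. Everything in it is hypothetical (the server names, the tool names, the allowlist itself), and a real deployment would wire this logic into the agent framework's tool-dispatch path rather than a standalone function, but it shows what "something between intent and action" means at its most minimal.

```python
# Hypothetical sketch of runtime enforcement: a policy check between the agent's
# intent and the tool call. Server names, tool names, and the allowlist are invented.
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    server: str        # which MCP server the agent wants to invoke
    tool: str          # which tool on that server
    arguments: dict = field(default_factory=dict)

ALLOWLIST = {
    ("calendar", "create_event"),
    ("github-internal", "open_pull_request"),
}

def enforce(call: ToolCall) -> None:
    """Log every attempted action and block anything not explicitly approved."""
    allowed = (call.server, call.tool) in ALLOWLIST
    print(f"AUDIT {'ALLOW' if allowed else 'DENY'} {call.server}.{call.tool} {call.arguments}")
    if not allowed:
        raise PermissionError(f"{call.server}.{call.tool} is not an approved action")

enforce(ToolCall("calendar", "create_event", {"title": "standup"}))  # allowed
try:
    enforce(ToolCall("prod-db", "run_query", {"sql": "DROP TABLE users"}))
except PermissionError as err:
    print("blocked:", err)
```
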
And longer-term: engage with the vendors shaping this ecosystem. It’s much easier to influence a standard that's five months old than one that's twenty years old.

Where this leaves us

The OX disclosure is going to fade from the news cycle in a couple of weeks, but the story it's part of isn't going anywhere.

A new dependency ecosystem is forming and it's growing fast. It's already producing the same supply chain failures we spent twenty years learning to address in open source, compressed into months rather than years, in a threat environment with AI-accelerated discovery on the offensive side. And the governance work that has to happen to keep it from becoming a chronic security crisis is happening right now, in product roadmaps and standards bodies and procurement conversations.

The window for security leaders to shape this is narrower than it looks, but it's also wider open than it'll ever be again.