Modern vulnerability management is overwhelmed. Security teams face an impossible equation: increased CVE volume (each year brings a new record), expanded software supply chains, growing dependency trees, limited engineering bandwidth, and crushing ticket fatigue from SLA-driven patching.
And the stakes keep rising. AI is dramatically shrinking the time between disclosure and exploitation; recent data suggests that 50-60% of the vulnerabilities that do get exploited are weaponized within 48 hours of disclosure.
Still, the vast majority of vulnerabilities are never exploited. Multiple industry studies suggest that fewer than 5% of vulnerabilities are ever weaponized in the wild. The real challenge isn’t discovering vulnerabilities — it’s identifying the small subset of them that actually matters.
Traditional models focus on raw scanner output, CVSS scores, static severity labels, and SLA tracking. But severity doesn’t equal risk. The primary goal needs to be prioritizing the right vulnerabilities, getting them fixed quickly, and reducing systemic recurrence.
Our vision: From vulnerability backlog to true risk reduction
At Box, we believe vulnerability management needs a fundamental shift.
For years, the industry has optimized for vulnerability throughput — scanning faster, patching faster, and closing more tickets. But this model treats vulnerabilities as an operational backlog to manage rather than an exposure problem to solve.
AI allows us to rethink that model. Instead of focusing on processing findings, we can focus on reducing the conditions that create risk in the first place.
We’re using AI to transform vulnerability management from a reporting system into a risk-reduction engine:
- Reactive → Predictive
- Volume-based → Risk-based
- Ticket-driven → Intelligence-driven
- Static scoring → Context-aware prioritization
AI becomes the connective layer that correlates vulnerabilities to real-world exploit activity, maps exposure data to our asset inventory, identifies mitigating controls, and guides remediation with precision.
The goal isn't "patch more" — it's "reduce measurable exposure."
This model also works in highly regulated environments. As a FedRAMP High authorized service provider, we operate under strict vulnerability management and patching requirements.
Context-aware prioritization doesn’t replace those controls — it strengthens them by ensuring remediation efforts focus first on vulnerabilities that create real exposure in our environment.
Here’s how we’re approaching it
Contextual risk instead of static severity
Traditional programs prioritize based on CVSS scores, but severity alone doesn't tell you:
- Is this asset exposed to the internet?
- Is it production-facing?
- Is there active exploitation?
- Does it create a meaningful attack path?
At Box, we use AI-powered automation to correlate asset criticality, exposure data, exploit intelligence, and real-world attacker behavior. For example, instead of just seeing a vulnerability in a library, we're looking at whether the vulnerable function is actually called, and whether the exploit path can reach it in our environment. Instead of treating everything as critical, we identify what actually increases exposure risk.
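To make the idea concrete, here is a minimal sketch of context-aware scoring. The weights, field names, and the `Finding` type are illustrative assumptions, not Box's actual model; the point is that environmental context scales raw severity up or down rather than replacing it.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    cve_id: str
    cvss: float            # base severity, 0-10
    internet_exposed: bool # asset reachable from the internet
    production: bool       # production-facing asset
    known_exploited: bool  # e.g., seen in threat intel or a KEV-style catalog
    reachable: bool        # vulnerable function is actually callable in our build

def contextual_risk(f: Finding) -> float:
    """Scale base severity by environmental context instead of using it raw."""
    score = f.cvss
    score *= 2.0 if f.known_exploited else 1.0   # active exploitation dominates
    score *= 1.5 if f.internet_exposed else 0.7  # exposure widens the attack path
    score *= 1.3 if f.production else 0.8
    score *= 1.0 if f.reachable else 0.2         # unreachable code is low priority
    return round(score, 1)

findings = [
    # A "critical" CVSS score on an internal, non-reachable asset...
    Finding("CVE-A", cvss=9.8, internet_exposed=False, production=False,
            known_exploited=False, reachable=False),
    # ...versus a "high" on an exposed, exploited, reachable production path.
    Finding("CVE-B", cvss=7.5, internet_exposed=True, production=True,
            known_exploited=True, reachable=True),
]
for f in sorted(findings, key=contextual_risk, reverse=True):
    print(f.cve_id, contextual_risk(f))
```

Under this toy model the 7.5 "high" outranks the 9.8 "critical" by an order of magnitude, which is exactly the inversion that static severity scoring cannot express.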
Turning prioritization into remediation
But prioritization isn't enough; AI also accelerates remediation.
We've built an agent that generates pull requests to remediate vulnerabilities automatically. When a vulnerable dependency or code issue is identified, the agent:
- Pulls in reachability data
- Evaluates the risk
- Applies logic for auto-patching
Instead of filing a ticket and waiting, we deliver a structured solution that engineers can review and merge. Security becomes a remediation accelerator, not a reporting function.
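The agent's decision flow described above can be sketched roughly as follows. Function names, thresholds, and the `dependency_bump` gating rule are hypothetical assumptions for illustration, not the actual implementation.

```python
# Hypothetical sketch of an auto-remediation decision flow: pull reachability
# data, evaluate contextual risk, then decide whether to open a fix PR.

def plan_remediation(finding: dict) -> dict:
    """Decide whether a finding gets an automatic fix PR or a review task."""
    # 1. Pull in reachability data (stubbed here as a field on the finding).
    reachable = finding.get("reachable", True)

    # 2. Evaluate the risk using context, not raw severity alone.
    risk = finding["cvss"] * (2.0 if finding.get("known_exploited") else 1.0)
    risk *= 1.0 if reachable else 0.2

    # 3. Apply auto-patching logic: only low-risk, well-understood changes
    #    (e.g., a passing dependency bump) ship as an automatic PR.
    if finding.get("fix_type") == "dependency_bump" and finding.get("tests_pass"):
        action = "open_pr"
    else:
        action = "route_for_review"

    return {
        "cve": finding["cve_id"],
        "risk": round(risk, 1),
        "action": action,
        "pr_title": (f"fix: bump {finding.get('package', '?')} "
                     f"for {finding['cve_id']}") if action == "open_pr" else None,
    }

plan = plan_remediation({
    "cve_id": "CVE-X", "cvss": 8.1, "known_exploited": True, "reachable": True,
    "fix_type": "dependency_bump", "package": "libexample", "tests_pass": True,
})
print(plan["action"], plan["pr_title"])
```

The key design choice is the narrow auto-patch gate: anything the agent cannot fix with high confidence is routed to a human rather than merged blindly.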
Fixing vulnerabilities where developers work
We've also implemented AI-powered SAST tooling that suggests fixes directly within the developer workflow. Instead of finding vulnerabilities late in the lifecycle, developers get:
- Real-time detection
- Context-aware, low-risk fixes
- Clear explanations of risk
By helping fix them before they reach production, AI reduces vulnerability volume at the source.
Reducing recurring exposure
AI-driven prioritization and remediation are powerful, but transformation also means reducing repeat findings. We complement AI with disciplined engineering practices like:
- Secure-by-default hardened baselines
- Continuous deployment of fresh, hardened base images
- Recurring pattern identification
- Frequent dependency refresh cycles
The goal isn't just faster patching — it's fewer vulnerabilities to patch over time.
What this changes operationally
The baseline is shrinking
Through stronger foundational engineering practices, the number of vulnerabilities entering our environment is declining. Rather than patching aging systems, we're replacing them with hardened builds and reducing recurring exposure patterns. Instead of managing an ever-expanding backlog, we can compress the attack surface.
Vulnerabilities are eliminated earlier
Vulnerabilities that would have historically shown up in scanner output are increasingly being addressed before they reach production. That reduces downstream remediation work, ticket volume, and escalations between security and engineering.
Prevention is reducing operational drag.
Urgency becomes meaningful again
By layering contextual risk into prioritization, the number of vulnerabilities that demand immediate action has decreased. We're no longer treating every "high severity" finding as equally urgent. Instead, we're focusing on the subset that meaningfully increases exposure in our environment. This has two effects:
- High-risk issues move faster
- Engineering fatigue decreases
Noise is no longer competing with real risk.
We're measuring exposure, not activity
Here’s what success used to look like:
- Tickets closed
- SLAs met
- Patch percentages achieved
Here’s what success looks like now:
- Shorter exposure windows
- Fewer exploitable attack paths
- Reduced recurrence of the same vulnerability classes
The conversation is shifting from throughput to risk reduction.
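As a sketch of what measuring exposure rather than activity looks like, the snippet below computes two of the metrics named above from a remediation history: exposure windows and recurrence of vulnerability classes. The records and class names are invented for illustration.

```python
from datetime import date
from collections import Counter
from statistics import median

# Illustrative remediation records: (vuln_class, disclosed, remediated).
history = [
    ("sql-injection",       date(2024, 1, 3), date(2024, 1, 10)),
    ("outdated-dependency", date(2024, 2, 1), date(2024, 2, 4)),
    ("outdated-dependency", date(2024, 3, 5), date(2024, 3, 7)),
    ("xss",                 date(2024, 4, 2), date(2024, 4, 30)),
]

# Exposure window: days between disclosure and remediation, per finding.
windows = [(fixed - disclosed).days for _, disclosed, fixed in history]
print("median exposure window (days):", median(windows))

# Recurrence: vulnerability classes that keep reappearing, which points
# at a systemic cause rather than a one-off patching task.
by_class = Counter(cls for cls, _, _ in history)
recurring = {cls: n for cls, n in by_class.items() if n > 1}
print("recurring classes:", recurring)
```

Neither number rewards closing tickets; both go down only when real exposure goes down.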
The strategic shift
Without structural change, vulnerability management scales linearly: more assets → more vulnerabilities → more tickets → more fatigue.
What's changing for us is the trajectory:
- Fewer vulnerabilities are introduced
- Fewer require urgent action
- More effort targets systemic elimination
That's not just acceleration. It's a shrinking attack surface.
What's hard (and what we're working through)
Transforming vulnerability management this way introduces new challenges. Many of them sit at the intersection of data quality, developer experience, and operational stability.
Data quality is fragmented: Context-aware prioritization depends on accurate asset inventories, ownership data, and telemetry. In reality, those signals are often incomplete or inconsistent. Cloud infrastructure is ephemeral, services evolve quickly, and ownership changes frequently. AI can help correlate signals, but risk models are only as strong as the underlying data.
Risk scoring must be explainable to build trust: Building a sophisticated prioritization model is relatively easy. Getting engineers to trust it is much harder. If developers can’t understand why a vulnerability is being prioritized, they fall back to CVSS severity and ignore the model. Prioritization must be transparent, defensible, and grounded in observable signals.
Optimization can create blind spots: AI-driven prioritization tends to emphasize exploit likelihood and external exposure. But some of the most dangerous vulnerabilities are latent — privilege escalation paths, architectural weaknesses, or chained attack paths that only emerge in combination. Risk models must be continuously tested and tuned to avoid tunnel vision.
Automated remediation introduces operational risk: Automatically generating fixes can accelerate remediation, but it also introduces the possibility of breaking builds or destabilizing production systems. Guardrails like staged rollout, dependency awareness, and strong testing are essential.
Active remediation requires resilient foundations: Rapid remediation only works in environments with reliable testing, strong dependency management, and well-defined development practices. Without those foundations, automation amplifies chaos instead of reducing risk.
Developer experience matters: Replacing tickets with PRs is easy; replacing them with PRs that developers actually want to merge is not. For automation to work, fixes must be high-quality, low-risk, and easy to review and merge.
A shrinking attack surface
AI doesn't fix vulnerability management by patching faster. It fixes it by helping us reduce exposure in the first place.
By modeling risk more accurately, aligning remediation with real-world threats, and fixing vulnerabilities earlier in the development lifecycle, the goal becomes simpler:
- Fewer vulnerabilities.
- Shorter exposure windows.
- Less noise competing with real risk.
Big Bet #1 changes how our SOC analysts spend their time. Big Bet #2 improves the quality of what reaches them. Now Big Bet #3 reduces the underlying exposure they’re responding to.
Together, they compound: better detections surface real risk, smarter prioritization fixes the right issues, and fewer vulnerabilities reduce the operational burden across the program.
But vulnerability management is only part of the equation.
To truly shrink the attack surface, we also need to evolve how systems are designed and how they’re tested.
In the next two posts, I’ll explore how AI can help scale security architecture and design reviews and enable continuous, AI-assisted penetration testing.
Because ultimately, the goal isn’t just faster detection or remediation, it’s building systems with less risk to begin with.