This article is the second part of a series on how Box executes as an AI-first company. See our first article here.
Only 5% of AI pilots succeed. At Box, we’re learning how to beat those odds.
After a year of broad experimentation across more than 100 agents, we’ve developed a four-stage formula—Ideation, Pilots, Rollout Preparation, and Scaled Adoption—to go from ideas to effective, impactful agents embedded into workflows across our business.
In part one of our AI-first blog series, you learned why establishing AI principles and governance comes first, and how to prioritize a few “big bets” for maximum business impact. In part two, we’ll take a deeper look at how the big bets process unfolds, from initial experimentation to full-scale adoption.
The Four-Phase Deployment Framework
Box’s agentic journey—from distributed experimentation to focused execution—follows four phases:

Phase 1: Ideation
Identify AI opportunities by examining internal pain points and inefficiencies, and select which big bets to pilot. Use bottom-up ideation (hackathons, idea submissions, organic creation in demo accounts, etc.), and prioritize high-repeatability, critical thinking opportunities using the 2x2 framework discussed in the first article in this series.
Phase 2: Pilots
Build and test a prioritized set of agents with a limited scope and set of users over 3-6 months. Validate performance against defined metrics (efficiency, automation, and net new work), test user experience, and iterate based on instrumented metrics.
Phase 3: Rollout Preparation
Transform successful pilots into production-ready, enterprise-wide solutions. This requires reliability thresholds, accurate knowledge management and data pipelines (which serve as inputs into agents), governance and safety protocols, and clear documentation.
Phase 4: Scaled Adoption
Maximize adoption through change management, including workflow redesign, comprehensive training programs, clear usage goals and targets, and adoption tracking to drive accountability. Begin tracking and measuring full impact, and compare results to anticipated efficiency, automation, or net new work done.
Ready to identify your AI transformation priorities?
Phase 1: Ideation—Many New Ideas
Box built an environment that fostered experimentation, allowing anyone to build any agent they wanted. “The thing that we did really, really well out of the gate is that we let people experiment,” explains Box CIO Ravi Malick. “We put policy guidelines in place, not technical guidelines. We didn't block everything.”
Box began ideation with a two-pronged approach:
1. Broad, rather than centralized, access to the Box AI Studio in a sandbox environment: This allowed technically curious employees to experiment immediately. Teams received Box AI Studio accounts and minimal structure. If someone wanted to build an agent, they were provided the tools to do so.
2. Idea submission by anyone: For those who didn’t have time to experiment but had great ideas, Box created a simple idea submission process. This ensured valuable use cases weren’t missed simply because someone lacked the bandwidth to build them.
The Strategic Rationale for Broad Access
Teams closest to the work understand their pain points better than executives planning from abstraction.
“The spark of innovation was with the team and with the individuals,” Nora Soza, Senior Director of GTM Strategy and Operations recounts. “You can’t undercut how valuable that is, because they’re the ones who are living and breathing the job and are able to give the most realistic view of how AI can transform their work.”
Still, without empirical evidence, centralized planning can’t predict where AI will deliver value. The experimentation phase helped generate that evidence—revealing:
- Which workflows AI naturally enhances versus where it struggles
- Which teams embrace AI versus which are more passive
Among those embracing AI were technology-curious employees, who explored different possibilities across business use cases.
“People who are thought leaders—who are wrapping their arms around AI—that’s where most of our initial agents came from,” recalls Soza. These employees took use cases and built custom instructions, creating over 100 ideas and experimental agents within a sandbox environment:
- Lead generation teams built research agents
- Sales created meeting preparation agents
- Marketing built content creation agents
- Engineering created a developer documentation agent
- People team built a recruiting kickoff agent
- Governance, Risk Management, and Compliance developed a policy extraction agent
Ideas and agents at this stage were still experimental and geared for individual use in sandbox environments—not yet production-ready for broader use.
Why This Phase Mattered
This phase proved essential for a couple of key reasons:
- Pattern recognition: With dozens of experiments running simultaneously, patterns emerged about where AI naturally fit versus where it struggled
- Cultural shift: Rather than fear AI, employees actively explored its possibilities
Ideation was just the beginning. It soon became clear that company-wide adoption of a few focused agents, potentially only two to three per team, would prove most valuable for impact.
“While there might have been this belief that you can just spin up your own agent and get going—yes, that’s true for some smaller tasks,” Box Chief Operating Officer Olivia Nottebohm explains. “But if you’re really trying to transform your business, you probably want to be a little more intentional about it.”
Phase 2: Pilots—Testing Big Bets
In the second phase, Pilots, Box handpicked “big bets” (ideas for agents that we believed would have the highest impact) across the company and developed them for testing with a broader set of users in a production environment. We then intentionally deprioritized work on other agent ideas to focus on making these bets a success.
“A lot of small agents—that people have not explicitly understood how to work into their workflows—leads to little impact,” Nottebohm says. “Having 10 people use an agent is not a win, no matter how good the agent is. You need 300 people to use two to three agents that are consistently helpful and can transform workflows.”
Not all ideas or agents developed in the ideation phase made it to pilot. Box purposely focused on those that we believed would have the greatest impact: those tackling use cases with a high degree of repeatability and critical thinking.
By driving a consolidation to a few big bets, we’ve unlocked several major benefits, including:
- More powerful agents: We focus on creating a few exceptional agents vs. many. We ensure individual agents work well, power complex use cases, and deliver on integrations with other systems and data sources.
- Time saved on building: We build once for many use cases, instead of creating multiple solutions for the same kind of challenge. Box examples include our account research agent (excels at analyzing customer data), our positioning agent (deeply understands value propositions), and our sales methodology agent (applies value-based selling principles).
- Greater team capacity for change: It’s much easier to adapt processes and learn to use two to three agents, versus 10 or more.
Simplicity as a security best practice
When agents access sensitive corporate data across multiple systems, security and compliance become exponentially more complex, requiring proper authentication, access controls, and audit trails.
“Our default to build in Box AI gives us a head start on this, because Box AI agents respect the underlying governance, security, and permissions of content that lives in Box,” explains Robert Ferguson, Box’s Head of Corporate Strategy & Chief of Staff to the CEO.
However, to make agents even more powerful, we also want to bring in data from other systems.
Nottebohm adds: “The way to really get value out of these agents is when you start connecting them onto other things. For us, we wanted agents that could access our data in Google Cloud Platform, or Salesforce, not only data held in the Box platform.”
Teams rarely have both the bandwidth and the expertise to implement these integrations correctly and securely. A centralized approach to controlling integrations (and integration requests) ensures agents maintain consistent security standards and governance protocols.
How we consolidated ideas into big bets to pilot
The consolidation process involved two key steps:
1. Identify overlapping capabilities across ideation phase agents
The consolidation process—done in cross-functional working sessions led by Functional AI Leaders like Soza, with the help of the Design & Build team—combed through all agents to spot areas of overlap. In many cases, upwards of 10 similar agents would be merged into a couple of more robust versions.
Example: Instead of separate agents for feature adoption, Box Hubs adoption, and Box Sign adoption, Box built one feature adoption toolkit that could apply positioning, use case discovery, and adoption best practices to any feature.
Why? As Soza points out: “We’d underestimated AI’s ability to handle ambiguity. One well-coached agent could contextualize for any feature—we just hadn’t given it the chance.”
2. Prioritize high repeatability and critical thinking
Functional Leaders then used the prioritization matrix, covered in the first article in this series, to identify agents in the “sweet spot”—areas of high repeatability that require high levels of critical thinking—which tends to signal the highest ROI.
This helped force explicit prioritization: If you could only pick 2-3 initiatives, what would matter most?
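As an illustration only (the scores, names, and threshold below are hypothetical, not Box’s internal tooling), the sweet-spot filter can be sketched as a simple scoring pass over candidate ideas, where each idea is rated on the two axes of the 2x2 matrix:

```python
# Illustrative sketch of the 2x2 prioritization filter. Ratings (1-5) and
# the threshold are invented for the example, not Box's actual criteria.
ideas = [
    {"name": "meeting prep agent", "repeatability": 5, "critical_thinking": 5},
    {"name": "account research agent", "repeatability": 5, "critical_thinking": 4},
    {"name": "one-off slide formatter", "repeatability": 2, "critical_thinking": 1},
]

THRESHOLD = 4  # "high" on a 1-5 scale

def sweet_spot(idea):
    """High repeatability AND high critical thinking -> likely big bet."""
    return idea["repeatability"] >= THRESHOLD and idea["critical_thinking"] >= THRESHOLD

# Force explicit prioritization: keep only the top 2-3 qualifying ideas.
big_bets = sorted(
    (i for i in ideas if sweet_spot(i)),
    key=lambda i: i["repeatability"] + i["critical_thinking"],
    reverse=True,
)[:3]

for idea in big_bets:
    print(idea["name"])
```

The hard cap on the ranked list mirrors the forcing question above: if you could only pick two or three initiatives, which would matter most?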
“Where can you have an impact over what surface area—and what is the greatest lift?” Nottebohm says. “For your function, think about what your top two priorities are in terms of where you’re going to leverage AI.”

By consolidating ideas and experimental agents from the Ideation phase into big bets, we made it more likely that pilots would successfully turn into agents that drive significant impact for the company.
Measuring big bets for success
In the Pilot phase, it was critical to set clear, measurable goals for agent performance. Teams adhered to a checklist to ensure each pilot truly met its success metrics.
Examples of measures include:
- Efficiency (e.g. hours saved) on marketing copy or recruiting packages
- Percentage of support tickets deflected to self-service
- Net new work, like AI-powered industry-specific meeting prep
Widespread and consistent use across user testing groups was also crucial. Not only should agents be effective at the job they’re asked to do, they should be easy to use (and people should want to use them).
Big bets that met key success metrics then received support from the Design & Build team for production-grade development and integration with other enterprise systems.
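To make the checklist idea concrete, here is a minimal sketch of gating a pilot on its measured results. The metric names and thresholds are invented for illustration and are not Box’s internal criteria:

```python
# Hypothetical pilot scorecard: metric names and thresholds are illustrative
# stand-ins for the efficiency, automation, and usage measures described above.
thresholds = {
    "hours_saved_per_user_per_week": 2.0,  # efficiency
    "ticket_deflection_rate": 0.30,        # automation (self-service)
    "weekly_active_usage_rate": 0.60,      # people actually want to use it
}

def pilot_passes(results: dict) -> bool:
    """A pilot graduates to rollout prep only if every metric meets its bar."""
    return all(results.get(metric, 0) >= floor for metric, floor in thresholds.items())

pilot = {
    "hours_saved_per_user_per_week": 3.5,
    "ticket_deflection_rate": 0.42,
    "weekly_active_usage_rate": 0.71,
}
print(pilot_passes(pilot))
```

Treating usage as a gating metric alongside efficiency reflects the point above: an effective agent that nobody wants to use still fails the checklist.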
You can download your own copy of the 2x2 prioritization matrix, and the “big bets” planning template used internally at Box, here.
Phase 3: Rollout Preparation—Ensuring Agents are Production Ready
Agents that prove their performance in Phase 2 move into the rollout preparation phase, where we focus on ensuring each agent is production-ready for release to all relevant users.
This requires:
- Minimum thresholds of reliability and accuracy
- Accurate, consistent knowledge management/data pipelines (a dedicated effort—agents are only as good as the content they reference)
- Governance and safety guidelines that define who can access the agent (we had a head start on this, since agents built in Box automatically respect underlying user data permissions)
From there, an agent is released to a full set of users with clear communication on how to use it, and the organization continues gathering feedback to make refinements.
Big bets vs. hero agents
Big bet: An idea selected for development as a pilot because of its potential impact
Hero agent: A piloted agent that’s proven to be high impact
Phase 4: Scaled Adoption—Embedding Agents into Daily Use
Phase 4 focuses on change management aimed at driving 100% adoption and consistent use of a few key “hero” agents. Where big bets are ideas selected for piloting based on their potential impact, hero agents are piloted agents (and former big bets) with a proven ability to drive high impact.
Agents given “hero” status should be useful to a large number of people. Leveraged consistently, these agents deliver productivity gains, more effective work, and net new work that wouldn’t have been done before (e.g., our Intelligent Prospecting Agent allows SDRs to scale outreach at unprecedented levels).
Once an agent has been designated a hero agent, leadership backs it with change management support, training, and enablement at scale. This process is coming to life across a number of Box agents, including our marketing department’s “Bill of Materials” agents, which are undergoing pressure testing for blogs, social media, and other tasks. So far, they’ve accelerated email writing, enablement, and social content creation with consistency and accuracy.

Driven by Functional Leaders, this phase involves redesigning existing departmental workflows around agents, developing and deploying training to relevant employees, setting OKRs, and tracking adoption. Functional Leaders take on the role of communicating that agent use is the new normal.
These leaders set targets based on expected use (e.g. 100% of AEs use a “meeting prep” agent for every meeting). That way, teams remain accountable for leveraging the power of new agents.
AI Managers, appointed by Functional Leaders, are then tasked with developing training materials and enabling the team to use the new agents. AI Managers are also responsible for recognizing success in adopting agents.
“Recognition is an important part of change management,” says Soza. “People are excited to share the progress of really good agents and success behind that. A big part of this is we want to celebrate the people who lead the pack.”
A Continuous Loop
There’s no finish line in developing and launching agents, and that’s by design.
“Just because we’ve begun rolling out and scaling several agents, we’re not saying ideation stops,” explains Ferguson. “This will be a continual process with regular touchpoints every month, reviewing what’s being piloted, what’s scaling, what’s working, and what’s not.”
Some pilots reveal they need different technical approaches. Others show the business goal needs redefinition. The best ideas move forward, but there’s constant feedback and iteration. As grassroots innovation continues to “feed the hopper,” Box’s let-anyone-build-anything philosophy has transformed into targeted investment flows, supported by functional leaders and AI Managers who select priorities to pursue.
“Bottom-up innovation is always going to produce better results than top-down,” Malick adds. “Then you have the framework to say, ‘Okay, we’ve identified the really good use cases. Now we need to scale that.’”
The Operating Rhythm
The loop operates on monthly and quarterly cadences, with ongoing learning in between:
- Monthly: Review pilot progress, surface new ideas, identify what’s working
- Quarterly: Reassess big bet priorities (including graduation of new ideas), reallocate resources, and celebrate scaled wins
- Ongoing: Technology evolves rapidly; maintain mechanisms for continuous learning
This operating rhythm prevents the common mistake of treating AI transformation as a project with an end date. It’s instead an organizational muscle you build and maintain.
Takeaways for Your Organization
- Start with permission, not mandates. Let teams experiment broadly before imposing structure. You need empirical evidence about where AI creates value.
- Remove the builder barrier. Separate business needs from technical implementation. Anyone should be able to suggest use cases.
- Consolidate ruthlessly. After experimentation, merge overlapping agents. Fewer, better-coached hero agents outperform many mediocre ones.
- Transform workflows, not just tasks. The biggest gains come from reimagining complete processes, not automating individual steps.
- Empower Functional Leaders and their AI Managers. Give them ownership of their function’s AI transformation, supported by central expertise and resources.
- Make it continuous. AI transformation isn’t a project with an end date. Build monthly and quarterly review rhythms to maintain momentum.
Plot your big bets for the coming year. Download your resource pack now.
In our next article, we’ll dive deep into the ideation phase—how to surface internal pain points, enable bottom-up experimentation, and prioritize high-repeatability, high-critical-thinking use cases.




