The 90% problem: How Box cracked the code on the enterprise’s biggest data blind spot

Every enterprise sits on a goldmine it can’t touch. Ninety percent of organizational data is trapped in documents: agreements holding critical terms no system understands, invoices processed one by one, and contracts gathering dust in shared drives.

A Fortune 500 company might have millions of these documents. At 10 minutes per document, a backlog of 100,000 files represents roughly 16,700 hours of human labor. Now imagine a backlog like that building up across all kinds of content every month (a familiar scenario in fields such as accounting). Most companies simply can’t process it at all.

It’s not a scanning problem. OCR solved that decades ago. It’s not even an AI problem. Anyone can paste text into ChatGPT. It’s a trust problem.

When an invoice amount or contract term gets buried on page 47 of a PDF, you need certainty that your extraction is right. One misread decimal point can cost millions.

Until now, enterprises faced an impossible choice: hire armies of experts, attempt unreliable automation requiring constant verification, or let that 90% of data remain dark. Box Extract resolves that dilemma with multiple specialized AI agents that not only extract information from any document type, but tell you exactly how confident they are in every single field.

We spoke with Kelash Kumar, VP of Product Management at Box, about how Box Extract is transforming document intelligence from a manual bottleneck into an automated competitive advantage — and why the key isn’t just AI, but knowing when to trust it.

Key takeaways:

Box Extract turns organizational data into actionable, AI-ready information that can power business processes
Field-level confidence scoring builds trust by letting teams set thresholds, apply validation rules, and route low-confidence fields for retries or human review
No-code customization empowers users to tailor extraction agents, centrally manage processes, and handle varied content
Real-world use cases demonstrate improved speed and accuracy across onboarding, claims processing, and purchase order workflows
Future efforts will focus on automation tooling, validation, and global scaling to process millions of files

Why is extracting information from documents so important?

About 90% of organizational data is content — documents, data sheets, PDFs, marketing assets, videos. You can’t automate processes involving this content because you don’t know what’s inside it. It’s not AI-ready.

Historically, content has been incredibly difficult to work with because traditional techniques were inflexible, expensive, and brittle. Understanding it required deep expertise. This meant document processing was typically limited to high-volume document types, like invoices, POs, claims, and bank statements.

With generative AI and LLMs, we can now understand the meaning of that content at a far deeper level. Companies are saying, “I can finally use this content to make decisions and automate my processes at scale.”

And that’s with Box Extract?

Yes. Box Extract reveals the context and information within your content, making it AI-ready for process automation. We’re structuring content to make it actionable — whether that’s automating existing processes or building entirely new AI-powered ones.

Look at contracts. With Box Extract, companies can extract everything from basic entities, like effective dates, to complex assessments of contractual risk and obligations.

Box Extract helps business users customize extraction agents, centrally manage all extraction processes, and achieve highly accurate outcomes. It builds on our extraction agents released earlier this year, which were API-only. Now anyone in an organization can customize our agents for their specific job or workflow — no coding required.

Can’t people just send their PDFs to an LLM and extract information that way?

Sure, you can use ChatGPT with a prompt like “extract invoice number, date, and amount.” But results vary wildly. Context can get lost as documents get longer. Without explicit instructions, the chatbot might give ambiguous answers. You get no confidence signals about which fields are trustworthy. LLMs typically flatten everything into raw text, so good luck working with tables and varied layouts. Some content may not be recognized at all because the PDF contains rasterized images. And when the model updates, your workflow could easily break.

How does Box Extract approach this differently?

Before any extraction begins, Box Extract analyzes a document’s layout to identify different content types: paragraphs, tables, handwriting, images, graphs. Specialized agents then target the right regions: an OCR agent to digitize text, a handwriting recognition agent for signatures, a chart analysis agent for graphs. Providing smaller, focused context to LLMs yields far better and more stable results.
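To make the routing idea concrete, here is a minimal Python sketch of sending each detected region to a specialized handler. It is an illustration only, not Box Extract's implementation or API: the `Region` type, the agent functions, and the `AGENTS` table are all hypothetical names, and the agents just tag their input instead of doing real OCR or chart analysis.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Region:
    kind: str     # hypothetical labels: "paragraph", "table", "handwriting", "chart"
    content: str  # stand-in for the pixels or text of that region


# Hypothetical specialized agents; each one only sees its own kind of region.
def ocr_agent(region: Region) -> str:
    return f"text:{region.content}"


def handwriting_agent(region: Region) -> str:
    return f"handwriting:{region.content}"


def chart_agent(region: Region) -> str:
    return f"chart:{region.content}"


# Assumed mapping from region kind to the agent that handles it.
AGENTS: Dict[str, Callable[[Region], str]] = {
    "paragraph": ocr_agent,
    "table": ocr_agent,
    "handwriting": handwriting_agent,
    "chart": chart_agent,
}


def route_regions(regions: List[Region]) -> List[str]:
    """Send each region to the agent registered for its kind."""
    results = []
    for region in regions:
        # Unknown kinds fall back to OCR in this sketch.
        agent = AGENTS.get(region.kind, ocr_agent)
        results.append(agent(region))
    return results


page = [
    Region("paragraph", "Invoice 1042"),
    Region("handwriting", "J. Doe"),
    Region("chart", "Q3 revenue"),
]
outputs = route_regions(page)
```

The point of the design is that each handler receives a small, focused input, which mirrors the "smaller, focused context" idea described above.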

The game-changer is field-level confidence scoring. In an enterprise setting, saying “this document extraction is 95% accurate” isn’t enough. You need to know which fields might be wrong.

How does confidence scoring work in practice?

Not all information carries equal weight. A payment amount is critical; a product title less so. Box Extract lets you express these priorities. You might require 99% confidence for account numbers while accepting 90% for product descriptions. You can also specify validation rules: a value should be a number, fall within a range, or match a customer ID in your database.
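As a rough illustration of per-field priorities, the sketch below pairs each field with a minimum confidence and an optional validation rule. The 99% and 90% thresholds for account numbers and product descriptions come from the example above; everything else (the `FieldPolicy` type, the field names, the payment threshold, and the amount range) is an assumption made for the example and not Box Extract's configuration format.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FieldPolicy:
    min_confidence: float
    validator: Optional[Callable[[str], bool]] = None


def is_amount_in_range(value: str) -> bool:
    """Validation rule: the value must parse as a number within an assumed range."""
    try:
        amount = float(value)
    except ValueError:
        return False
    return 0 < amount <= 1_000_000  # the range is an assumption for illustration


# Hypothetical policies; only the 0.99 and 0.90 figures for the first and
# last entries come from the text.
POLICIES = {
    "account_number": FieldPolicy(0.99, lambda v: v.isdigit()),
    "payment_amount": FieldPolicy(0.99, is_amount_in_range),
    "product_description": FieldPolicy(0.90),
}


def field_passes(name: str, value: str, confidence: float) -> bool:
    """A field passes only if it meets its threshold and its validation rule."""
    policy = POLICIES[name]
    if confidence < policy.min_confidence:
        return False
    return policy.validator is None or policy.validator(value)
```

With these policies, `field_passes("product_description", "Blue widget", 0.92)` is accepted, while `field_passes("account_number", "12345678", 0.98)` is not, because the account number carries the stricter threshold.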

When the system returns 98% confidence on a field requiring 99%, it flags that field for human review. Low-confidence extractions can also trigger agent reflection: the system retries with different strategies before escalating to a human. This delivers human-level performance because the system knows when to ask for help.
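The retry-then-escalate flow can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: each "strategy" is a hypothetical callable returning a value and a confidence, and the function name and status strings are invented for the example. It does not describe how Box Extract implements agent reflection.

```python
from typing import Callable, List, Tuple

Extraction = Tuple[str, float]  # (value, confidence)


def extract_with_escalation(
    strategies: List[Callable[[], Extraction]],
    threshold: float,
) -> Tuple[str, str]:
    """Try each strategy in order; accept the first result that meets the
    threshold, otherwise hand the best attempt to a human reviewer."""
    best_value, best_confidence = "", -1.0
    for strategy in strategies:
        value, confidence = strategy()
        if confidence >= threshold:
            return ("accepted", value)
        if confidence > best_confidence:
            best_value, best_confidence = value, confidence
    return ("human_review", best_value)


# Hypothetical strategies: a first pass at 98% and a retry at 99.5%.
def first_pass() -> Extraction:
    return ("4,210.00", 0.98)


def retry_with_table_context() -> Extraction:
    return ("4210.00", 0.995)


with_retry = extract_with_escalation([first_pass, retry_with_table_context], 0.99)
without_retry = extract_with_escalation([first_pass], 0.99)
```

Here `with_retry` is accepted on the second attempt, while `without_retry` is routed to human review, matching the 98%-versus-99% example above.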

This is about trustworthiness. You shouldn’t have to check every number. You need to trust the process — but know exactly when human expertise is needed.

Kelash Kumar, VP of Product Management at Box

Can you give me a real example of how this works?

One of my favorite examples is a wealth management firm that uses Box to onboard clients. They collect 150 to 200 sensitive documents per customer, such as bank statements and tax information.

They need these documents gathered and stored securely, and the extracted information needs to be accurate. Ultimately, they want to empower their wealth advisors to onboard as many clients as possible.

So within Box Extract, they first took those documents and categorized them into 15 document types, each with specific metadata templates defining what information to extract. They built a ground truth dataset — annotated documents with correct answers — to customize Box Extract for their specific content. Their automated workflows now enable wealth advisors to serve clients at a deeper level and provide a much faster turnaround.
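One generic way a ground truth dataset gets used is to measure accuracy field by field. The Python sketch below shows that idea only; the function name, field names, and sample values are hypothetical, and the source does not say this is how the firm or Box Extract evaluates extraction.

```python
from collections import defaultdict
from typing import Dict, List


def per_field_accuracy(
    predictions: List[Dict[str, str]],
    ground_truth: List[Dict[str, str]],
) -> Dict[str, float]:
    """Compare extracted fields with annotated answers, document by document."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for predicted, expected in zip(predictions, ground_truth):
        for field, expected_value in expected.items():
            total[field] += 1
            # A missing or different value counts as an error.
            if predicted.get(field) == expected_value:
                correct[field] += 1
    return {field: correct[field] / total[field] for field in total}


# Hypothetical annotated documents and extraction results.
ground_truth = [
    {"account_number": "111", "balance": "50.00"},
    {"account_number": "222", "balance": "75.00"},
]
predictions = [
    {"account_number": "111", "balance": "50.00"},
    {"account_number": "222", "balance": "15.00"},
]
accuracy = per_field_accuracy(predictions, ground_truth)
```

In this made-up sample, `account_number` scores 1.0 and `balance` scores 0.5, which is the kind of signal that shows which fields need more attention.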

What about other industries — how are they using this?

In insurance, teams use Extract to process claims and increase recovery rates by analyzing images and incident reports. Claims adjusters previously spent days reviewing hundreds of documents per claim — accident reports, medical records, repair estimates. With Box Extract’s confidence-based routing, complex claims get processed in hours.

We’re also seeing finance organizations within businesses transform things like purchase order management. Finance teams responsible for purchase orders faced a constant battle with manual data entry and discrepancies caught late. Now, Box Extract helps them manage purchase orders far more quickly and effectively.

Our mission is extracting context from content at massive scale with high accuracy — the widest range of document types with the highest accuracy possible.

Kelash Kumar, VP of Product Management at Box

What’s next?

We’re working on three key areas:

First, automation tools — helping you annotate data, build evaluation sets, and optimize prompts for reliable agentic systems.

Second, assessment and validation agents. Customers say, “I’d like to make an assessment based on this information.” Imagine not just extracting that a contract has a liability clause, but having an agent assess the risk level. Not just pulling financial figures, but validating them against business rules. These agents would turbocharge the process.

Third is scale. It doesn’t work if it only works at a small scale. We’re talking about processing millions of files while maintaining accuracy. We’re delivering Box scale globally — whether you’re processing 100 documents or 100 million, whether you’re in San Francisco or Singapore, the system needs to deliver reliable outcomes.

The vision is clear: Get your context out of your content at massive scale and with high accuracy. Because until you can trust the system at scale, you haven’t solved the problem.

Note: A subset of features mentioned in this blog will be included at general availability, with more capabilities to follow.
