First Look: GPT-4.5 and Box AI

Today, OpenAI released its latest model, GPT-4.5, which is available to Box customers in the Box AI Studio starting today. GPT-4.5 is a breakthrough new model from OpenAI that makes major strides in coding, math, reasoning capabilities, and more. This makes it particularly potent for enterprise use cases, where accuracy and integrity are mission-critical.
As with previous releases from OpenAI, our testing shows that GPT-4.5 is one of the best models available, both in terms of our eval scores and in its ability to handle many of the hardest AI questions that we have come across. It is not a chain-of-thought reasoning model like OpenAI o3-mini, but it offers impressive understanding and reasoning on a variety of subjects.
Here are some results from Box’s early testing of the model:
- GPT-4.5 offered a four percentage point increase in accuracy over GPT-4o on our Enterprise Document Q&A eval set.
- GPT-4.5 scored higher than many previous non-chain-of-thought models on questions involving mathematical calculations. For instance, it answered questions about financial documents that required reasoning over the data and then calculating an accurate gross margin when that exact number wasn’t in the document.
- GPT-4.5 performed better than GPT-4o on questions that required the model to group and filter facts and then answer questions about them.
- GPT-4.5 excelled in particular in math and date calculations, which older models struggled with.
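To make the gross margin example above concrete, here is a minimal Python sketch of the arithmetic such a question requires once the relevant figures have been located in a document. The function name and the figures are hypothetical and chosen purely for illustration; they are not taken from Box's eval set.

```python
def gross_margin(revenue: float, cost_of_goods_sold: float) -> float:
    """Return gross margin as a fraction of revenue: (revenue - COGS) / revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cost_of_goods_sold) / revenue


# Illustrative figures only (not from any real financial document).
revenue = 1_250_000.0
cogs = 800_000.0

margin = gross_margin(revenue, cogs)
print(f"Gross margin: {margin:.1%}")
```

The point of the sketch is that the answer never appears in the document itself: the model has to find revenue and cost of goods sold, then perform the calculation correctly.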
Extracting more value out of your unstructured data with GPT-4.5
To further explore the capabilities of GPT-4.5, we focused on a key area with significant potential for enterprise impact: the extraction of structured data, or metadata extraction, from enterprise content. The ability to identify and retrieve structured information from unstructured content is one of the most transformative advancements in terms of how people work, enabling greater automation and more powerful workflows applicable to all employees.
At Box, we rigorously evaluate data extraction models using multiple enterprise-grade datasets. One key dataset we leverage is CUAD, which consists of over 510 commercial legal contracts. Within this dataset, Box has identified 20,000 fields that can be extracted from unstructured content and evaluated the model on single-shot extraction of these fields (this is our hardest test, where the model has only one chance to extract all the metadata in a single pass rather than taking multiple attempts). In our tests, GPT-4.5 extracted 19 percentage points more fields correctly than GPT-4o, highlighting its improved ability to handle nuanced contract data.
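For readers who want to see how a field-level comparison like this can be scored, here is a minimal Python sketch. It assumes exact-match scoring after light normalization, which may differ from the scoring Box actually uses; the field names, values, and function names are hypothetical, and the resulting numbers are illustrative rather than Box's results.

```python
from typing import Mapping, Optional


def normalize(value: Optional[str]) -> str:
    """Collapse whitespace and case so trivial formatting differences are not counted as errors."""
    return " ".join((value or "").split()).casefold()


def field_accuracy(predicted: Mapping[str, Optional[str]], expected: Mapping[str, str]) -> float:
    """Fraction of expected fields whose single-pass prediction matches the labeled value."""
    if not expected:
        raise ValueError("expected must contain at least one field")
    correct = sum(
        1
        for name, truth in expected.items()
        if normalize(predicted.get(name)) == normalize(truth)
    )
    return correct / len(expected)


def percentage_point_gain(baseline: float, candidate: float) -> float:
    """Difference between two accuracies, expressed in percentage points."""
    return (candidate - baseline) * 100.0


# Hypothetical labeled fields for one contract (assumed names and values).
expected = {
    "governing_law": "Delaware",
    "effective_date": "2020-01-15",
    "renewal_term": "1 year",
    "counterparty": "Example LLC",
}

# Hypothetical single-pass outputs from two models.
baseline_output = {
    "governing_law": "delaware",
    "effective_date": "2020-01-15",
    "renewal_term": "2 years",
    "counterparty": None,
}
candidate_output = {
    "governing_law": "Delaware",
    "effective_date": "2020-01-15",
    "renewal_term": "1 year",
    "counterparty": "Example Inc",
}

baseline_acc = field_accuracy(baseline_output, expected)
candidate_acc = field_accuracy(candidate_output, expected)
gain = percentage_point_gain(baseline_acc, candidate_acc)
print(f"baseline={baseline_acc:.0%} candidate={candidate_acc:.0%} gain={gain:.0f} points")
```

Note that a gain in percentage points is the plain difference between two accuracy figures, not a relative improvement, which is how the comparisons in this post are stated.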

To ensure GPT-4.5 could handle the demands of real-world enterprise content, we evaluated its performance against a more rigorous set of documents, Box’s own challenge set. We selected a subset of complex legal contracts – those with multi-modal content (data and text), high-density information, and lengths exceeding 200 pages – to represent some of the most difficult scenarios our customers face. On this challenge set, GPT-4.5 consistently outperformed GPT-4o in extracting key fields with higher accuracy, demonstrating its superior ability to handle intricate and nuanced legal documents.

These findings indicate that GPT-4.5 offers substantial improvements over GPT-4o in several critical areas for enterprise document understanding and processing.
Unlocking deeper insights with GPT-4.5
So what does this mean for you? Picture a normal day: you have a stack of research papers to go through, and let's be honest, you’re dealing with way too much information as it is, so getting through those files is an uphill battle. Instead of spending hours analyzing and comparing, imagine if you could just get the key facts you need with one click. This type of work makes up the bulk of content managed in Box, and in our testing, GPT-4.5 with Box AI was extremely helpful at deriving insights and more accurate at pulling out the important details from your documents.
Here, GPT-4.5 is able to dramatically streamline our example of a research review process by synthesizing documents stored in a Box Hub. Researchers can leverage its power to:
- Rapidly summarize: Instantly grasp the core findings of complex research papers.
- Extract key data: Quickly identify relevant data points, experimental setups, and results.
- Identify trends: Uncover emerging patterns and connections across multiple studies.
- Compare methodologies: Easily contrast different research approaches and their limitations.
- Synthesize information: Create the foundation for new hypotheses.
This acceleration of the research process fuels faster breakthroughs and more informed scientific inquiry.
Box AI and GPT-4.5 across your organization
GPT-4.5 isn't just for accelerating the pace of literature review and fueling groundbreaking discoveries. Its capabilities can also be applied to other departments across your organization. Let’s take a look at a few use cases you could create:
Legal: Imagine needing to instantly identify critical clauses or specific provisions buried deep within a lengthy contract. GPT-4.5 empowers legal teams to analyze documents with unparalleled speed and precision, ensuring nothing is missed.
Customer support: Customer support teams can resolve inquiries with greater efficiency by using GPT-4.5 to quickly pinpoint relevant information from customer documents and knowledge bases, leading to improved response times and increased customer satisfaction.
Sales: Sales teams can use GPT-4.5 to automatically generate concise summaries of contracts, highlighting key terms and potential risks, saving valuable time and improving deal closure rates.
Marketing: Marketing teams can leverage GPT-4.5 to analyze customer data and automatically generate highly targeted campaign materials, increasing engagement and ROI.
The next step in content analysis is here
Remember that scenario of instantly finding the key information you need? Now, with GPT-4.5, Box AI is even faster, even more accurate, and even more powerful. Experience the difference. Try Box AI and GPT-4.5 today.