
In today’s rapidly evolving AI landscape, there’s a growing need to connect large language models (LLMs) like Claude with enterprise systems and APIs. At Box, we’ve been exploring how to use these powerful AI capabilities to enhance our content management platform. This article walks through our journey of teaching Claude to interact directly with the Box API using the Model Context Protocol (MCP) framework.
What is Model Context Protocol?
The Model Context Protocol (MCP) is a framework designed to standardize how AI models interact with various data sources and services. It creates a bridge between LLMs and external systems, enabling them to access and manipulate data beyond their training datasets.
Read more information about MCP here.
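To make the idea concrete, here is a toy sketch in plain Python of the pattern MCP standardizes: a server registers named tools with short descriptions, and a client (the model) discovers them and calls them by name. This is not the MCP SDK. `ToolRegistry`, `list_tools`, `call`, and `echo_upper` are hypothetical names used only for illustration, and a real server would also handle transport, argument schemas, and error reporting.

```python
import inspect
from typing import Any, Callable, Dict


class ToolRegistry:
    """Toy stand-in for a server's tool table (hypothetical, not the MCP SDK)."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def tool(self) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        """Decorator that registers a function under its own name."""

        def register(func: Callable[..., Any]) -> Callable[..., Any]:
            self._tools[func.__name__] = func
            return func

        return register

    def list_tools(self) -> Dict[str, str]:
        """Map each tool name to the first line of its docstring."""
        described: Dict[str, str] = {}
        for name, func in self._tools.items():
            doc = inspect.getdoc(func) or ""
            described[name] = doc.splitlines()[0] if doc else ""
        return described

    def call(self, name: str, **arguments: Any) -> Any:
        """Invoke a registered tool by name with keyword arguments."""
        if name not in self._tools:
            raise KeyError(f"Unknown tool: {name}")
        return self._tools[name](**arguments)


registry = ToolRegistry()


@registry.tool()
def echo_upper(text: str) -> str:
    """Return the text in upper case."""
    return text.upper()
```

The docstring doubles as the description the model sees, which is why the tools later in this article carry detailed docstrings.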
Box MCP server
We developed this server as a community project to demonstrate how to expose Box’s API through the Model Context Protocol.
This open-source Python implementation provides a sample toolkit enabling Claude to seamlessly interact with documents stored in Box.
By bridging these technologies, we’ve created a foundation that allows AI assistants to access enterprise content management capabilities with minimal friction, opening up new possibilities for intelligent document handling and analysis.
Tools implemented
box_who_am_i - Get the current user's information and check connection status
box_authorize_app_tool - Start the Box application authorization process
box_search_tool - Search for files in Box using queries, file extensions, and other parameters
box_read_tool - Read the text content of a specific Box file
box_ask_ai_tool - Ask Box AI questions about a file's content
box_search_folder_by_name - Locate a folder in Box by its name
box_ai_extract_data - Extract structured data from a file using AI
box_list_folder_content_by_folder_id - List the contents of a folder
These tools were implemented using the simplest, most direct way to access the Box API.
It is interesting to witness how Claude probes and combines the different tools to accomplish a given task.
Authentication flow
Claude needs to be able to tell whether we are connected to Box and, if not, how to establish a connection.
Because Claude acts on behalf of an individual user here, Box OAuth is the better fit, so we have to enable Claude to start the authorization process.
@mcp.tool()
async def box_authorize_app_tool() -> str:
    """
    Authorize the Box application.
    Start the Box app authorization process

    return:
        str: Message
    """
    result = authorize_app()
    if result:
        return "Box application authorized successfully"
    else:
        return "Box application not authorized"

Once the user completes this process, Claude can access Box using the user security context.
To check whether a connection is working, we use the traditional box_who_am_i tool, which automatically attempts to log in to Box with the existing credentials and returns the user who is currently logged in.
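The decision logic behind these two tools can be sketched as follows. In practice Claude sequences the tool calls itself; this sketch only makes the flow explicit. `ensure_connected`, `fake_who_am_i`, and `fake_authorize` are hypothetical stand-ins, not functions from the repository, and no real Box call is made.

```python
from typing import Callable, Dict, Optional


def ensure_connected(
    who_am_i: Callable[[], Optional[str]],
    authorize: Callable[[], bool],
) -> str:
    """Report the current user, starting authorization if nobody is logged in."""
    user = who_am_i()
    if user is not None:
        return f"Connected to Box as {user}"
    # No valid credentials: start the authorization flow, then check again.
    if not authorize():
        return "Box application not authorized"
    user = who_am_i()
    if user is None:
        return "Authorization finished but no user is logged in"
    return f"Connected to Box as {user}"


# Stand-in callables that simulate a first-time login (hypothetical).
_session: Dict[str, Optional[str]] = {"user": None}


def fake_who_am_i() -> Optional[str]:
    """Return the logged-in user, or None when there are no credentials."""
    return _session["user"]


def fake_authorize() -> bool:
    """Pretend the user completed the OAuth flow in the browser."""
    _session["user"] = "demo@example.com"
    return True
```

Calling `ensure_connected(fake_who_am_i, fake_authorize)` walks the same path a first-time user would: the identity check fails, authorization runs, and the second identity check succeeds.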
Search functionality
Box search accepts many parameters, which makes it a good test of whether Claude will adapt a search to its intended purpose.
The search tool allows Claude to find files in Box:
@mcp.tool()
async def box_search_tool(
    ctx: Context,
    query: str,
    file_extensions: List[str] | None = None,
    where_to_look_for_query: List[str] | None = None,
    ancestor_folder_ids: List[str] | None = None,
) -> str:
    """
    Search for files in Box with the given query.

    Args:
        query (str): The query to search for.
        file_extensions (List[str]): The file extensions to search for, for example *.pdf
        where_to_look_for_query (List[str]): where to look for the information, possible values are:
            NAME,
            DESCRIPTION,
            FILE_CONTENT,
            COMMENTS,
            TAG
        ancestor_folder_ids (List[str]): The ancestor folder IDs to search in.
    return:
        str: The search results.
    """
    # Get the Box client
    box_client: BoxClient = cast(
        BoxContext, ctx.request_context.lifespan_context
    ).client

    # Convert the where to look for query to content types
    content_types: List[SearchForContentContentTypes] = []
    if where_to_look_for_query:
        for content_type in where_to_look_for_query:
            content_types.append(SearchForContentContentTypes[content_type])

    # Search for files with the query
    search_results = box_search(
        box_client, query, file_extensions, content_types, ancestor_folder_ids
    )

    # Return the "id", "name", "description" of the search results
    search_results = [
        f"{file.name} (id:{file.id})"
        + (f" {file.description}" if file.description else "")
        for file in search_results
    ]
    return "\n".join(search_results)

To my surprise, and with only the information above, Claude does use the parameters to modify the search, for example limiting it to PDF documents, ancestor folders, or looking for the query only in the document name.
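One detail worth noting in the tool above: `SearchForContentContentTypes[content_type]` looks up an enum member by its exact name, so a value such as "name" in lowercase would raise a `KeyError`. A minimal sketch of a more forgiving conversion is shown below. `ContentType` is a stand-in enum with illustrative values, and `to_content_types` is a hypothetical helper, not part of the repository or the Box SDK.

```python
from enum import Enum
from typing import List, Optional


class ContentType(Enum):
    """Stand-in for the SDK's content-type enum (hypothetical; values are illustrative)."""

    NAME = "name"
    DESCRIPTION = "description"
    FILE_CONTENT = "file_content"
    COMMENTS = "comments"
    TAG = "tag"


def to_content_types(names: Optional[List[str]]) -> List[ContentType]:
    """Convert model-supplied names to enum members, rejecting unknown ones clearly."""
    result: List[ContentType] = []
    for raw in names or []:
        # Normalize case and whitespace before the by-name lookup.
        key = raw.strip().upper()
        try:
            result.append(ContentType[key])
        except KeyError:
            valid = ", ".join(member.name for member in ContentType)
            raise ValueError(
                f"Unknown content type {raw!r}; expected one of: {valid}"
            ) from None
    return result
```

Returning an error message that lists the valid values gives the model something to correct itself with on the next call, instead of a bare lookup failure.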
Considering that Claude might get lost in the search results, we also implemented a way for it to walk the folder structure using box_list_folder_content_by_folder_id, along with a more specific and more constrained box_search_folder_by_name.
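Walking the folder structure amounts to a depth-first traversal driven by repeated folder listings. The sketch below illustrates that idea against an in-memory tree. `walk_folder`, `fake_list_folder`, and `FAKE_TREE` are hypothetical names, and the tuple shape used for items is an assumption made for this example rather than the format the server returns.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# (item_type, item_id, item_name); this shape is an assumption for the sketch.
Item = Tuple[str, str, str]


def walk_folder(
    list_folder: Callable[[str], List[Item]],
    folder_id: str,
    path: str = "",
) -> Iterator[Tuple[str, str]]:
    """Yield (path, file_id) for every file under folder_id, depth first."""
    for item_type, item_id, name in list_folder(folder_id):
        item_path = f"{path}/{name}"
        if item_type == "folder":
            # Descend into subfolders by listing them in turn.
            yield from walk_folder(list_folder, item_id, item_path)
        else:
            yield item_path, item_id


# Hypothetical in-memory tree standing in for Box; "0" is the root folder ID.
FAKE_TREE: Dict[str, List[Item]] = {
    "0": [("folder", "10", "Contracts"), ("file", "1", "readme.txt")],
    "10": [("file", "2", "nda.pdf")],
}


def fake_list_folder(folder_id: str) -> List[Item]:
    """Return the direct children of a folder, or an empty list if unknown."""
    return FAKE_TREE.get(folder_id, [])
```

Claude performs the equivalent traversal one tool call at a time, deciding after each listing which subfolder is worth opening next.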
Read documents
A critical capability in our integration is giving Claude the ability to read content stored in Box. To accomplish this, we use the text representation feature, which extracts the textual content from various document formats.
When Claude needs to analyze a document, the MCP server requests this text representation through the Box API, providing a clean, normalized version of the document’s content.
This approach works seamlessly across multiple file types — including PDFs, Office documents, and plain text files — ensuring Claude can process and understand the information regardless of the original format.
@mcp.tool()
async def box_read_tool(ctx: Context, file_id: Any) -> str:
    """
    Read the text content of a file in Box.

    Args:
        file_id (str): The ID of the file to read.
    return:
        str: The text content of the file.
    """
    # log parameters and its type
    logging.info(f"file_id: {file_id}, type: {type(file_id)}")

    # check if file id isn't a string and convert to a string
    if not isinstance(file_id, str):
        file_id = str(file_id)

    # Get the Box client
    box_client: BoxClient = cast(
        BoxContext, ctx.request_context.lifespan_context
    ).client

    response = box_file_text_extract(box_client, file_id)
    return response

Ask Box AI
We’ve included Box AI capabilities as well. There’s an interesting dynamic when Claude asks Box AI about document content. This approach works effectively and eliminates the need to transfer document content from Box to external platforms.
For this implementation, we configured the Box AI agent to use Claude, creating a scenario where Claude effectively communicates with itself but through different systems.
@mcp.tool()
async def box_ask_ai_tool(ctx: Context, file_id: Any, prompt: str) -> str:
    """
    Ask Box AI about a file in Box.

    Args:
        file_id (str): The ID of the file to ask about.
        prompt (str): The prompt to ask the AI.
    return:
        str: The answer returned by Box AI.
    """
    # log parameters and its type
    logging.info(f"file_id: {file_id}, type: {type(file_id)}")

    # check if file id isn't a string and convert to a string
    if not isinstance(file_id, str):
        file_id = str(file_id)

    # Get the Box client
    box_client: BoxClient = cast(
        BoxContext, ctx.request_context.lifespan_context
    ).client

    ai_agent = box_claude_ai_agent_ask()
    response = box_file_ai_ask(box_client, file_id, prompt=prompt, ai_agent=ai_agent)
    return response

Demo
Here are some usage examples to illustrate Claude operating with the Box API.
Using the Box MCP server
You can get this sample code directly from our community repo:
GitHub — box-community/mcp-server-box: An MCP server capable of interacting with the Box API
Follow the README.md on how to set up and integrate it with your Claude desktop application.
Conclusion
By connecting Claude to the Box API through the Model Context Protocol, we’ve created a system that combines the intelligence of LLMs with the robust content management capabilities of Box. This integration allows users to interact with their Box content, alongside other sources, in more natural and intelligent ways, unlocking new possibilities for document management and analysis.
The MCP Server Box project demonstrates how enterprise systems can be extended with AI capabilities, creating more intuitive and powerful user experiences. As LLMs and content platforms continue to evolve, we expect to see even more innovative integrations that change how we interact with our content.


