Writing a Box agent with LangChain: Easier than you think

In this post, I’ll show you how easy it is to build a Box agent using LangChain’s tool interface. With just a few lines of Python, you can give an AI model the ability to search your Box content, read files, extract structured data, and even ask Box AI to summarize or analyze documents — all wrapped inside an agent that can reason step-by-step using LangGraph.

If you’ve looked into the Model Context Protocol (MCP) and thought, “Hey, this is just a structured way to plug in data sources for an AI model” — you’re right. Writing an MCP server to connect Claude to Box content is conceptually no different than building an agent using LangChain tools. In both cases, you’re wrapping API calls into a format that the AI can call during a conversation.

For comparison, take a look at our MCP article.

All the code discussed in this article is available at our community GitHub repository.

A video walkthrough of this content is also available.

Defining Tools in LangChain

In LangChain, a “tool” is just a function that the agent is allowed to call. These functions are wrapped with metadata — like descriptions and argument types — so the language model can decide when and how to use them.

The magic is in StructuredTool.from_function(), which lets you define tools straight from regular Python functions. Let’s look at a real example from our Box agent:

StructuredTool.from_function(
    self.box_search_tool,
    parse_docstring=True,
)

This wraps box_search_tool as an agent-callable function. The parse_docstring=True flag tells LangChain to automatically extract argument names and descriptions from the function’s docstring. That’s what makes it so easy to define clean, explainable tools.

Here’s what the underlying function looks like:

def box_search_tool(
    self,
    query: str,
    file_extensions: List[str] | None = None,
    where_to_look_for_query: List[str] | None = None,
    ancestor_folder_ids: List[str] | None = None,
) -> str:
    """Searches for files in Box using the specified query and filters.

    Args:
        query (str): The search query.
        file_extensions (List[str] | None): A list of file extensions to filter results (e.g., ['.pdf']).
        where_to_look_for_query (List[str] | None): Where to search for the query. Options:
            NAME, DESCRIPTION, FILE_CONTENT, COMMENTS, TAG
        ancestor_folder_ids (List[str] | None): Folder IDs to limit the search scope.

    Returns:
        str: A formatted string containing the search results.
    """

LangChain reads this docstring and builds a spec that the agent can use in natural language:

Search for files in Box where the name contains ‘Q2 report’, but only inside the ‘Finance’ folder and only look at PDF files.

Behind the scenes, this tool calls the box_search API and formats the results nicely:

search_results = box_search(
    self.client, query, file_extensions, content_types, ancestor_folder_ids
)

# Return id, name, and description for each match
search_results = [
    f"{file.name} (id:{file.id})"
    + (f" {file.description}" if file.description else "")
    for file in search_results
]

The output is just a plain string — easy for the model to read and reason with. No special formatting needed.

What makes this easy

You don’t need to write OpenAPI specs, JSON schemas, or special wrappers.
If your function has a good docstring, you already have a usable tool.
LangChain handles the conversion from user input → tool args → function call.

You can repeat this pattern for any Box operation — reading files, listing folder content, calling Box AI, etc. The whole agent is just a collection of wrapped Python functions.
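For instance, a read tool could follow exactly the same shape: a plain method with a Google-style docstring. In this sketch, box_file_text_extract is a stub standing in for whatever text-extraction helper your toolkit exposes — the real call will differ:

```python
def box_file_text_extract(client, file_id: str) -> str:
    """Stub standing in for the toolkit's text-extraction helper."""
    return f"(text of file {file_id})"

class BoxReadToolDemo:
    def __init__(self, client):
        self.client = client

    def box_read_tool(self, file_id: str) -> str:
        """Reads the text content of a file in Box.

        Args:
            file_id (str): The ID of the file to read.

        Returns:
            str: The text content of the file.
        """
        return box_file_text_extract(self.client, file_id)

demo = BoxReadToolDemo(client=None)
print(demo.box_read_tool("1234"))
# (text of file 1234)
```

Wrapping it is one more StructuredTool.from_function(...) line in the tool list.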

Instantiating the Box Agent

Once you’ve defined your tools, the next step is to create the actual agent — this is the piece that uses a language model to reason about what tool to call and when.

In this example, we use LangChain’s create_react_agent helper to build a simple ReAct-style agent (reasoning + acting) with a memory checkpoint. This agent can hold conversations, make decisions, and call the tools you’ve registered—all without writing any custom logic.

Here’s the key part:

memory = MemorySaver()
self.chat = create_react_agent(model, self.tools, checkpointer=memory)
  • model is any supported BaseChatModel, like Claude or GPT-4.
  • self.tools is the list of functions you registered earlier.
  • MemorySaver handles memory checkpoints in memory—great for quick experiments or stateless apps.
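With a checkpointer attached, each invoke() call needs a config that identifies the conversation thread — MemorySaver keys its checkpoints by thread_id. A minimal config dict looks like this (the thread ID value is arbitrary; reuse it to continue the same conversation):

```python
# One thread_id per conversation; a new ID starts a fresh thread
chat_config = {"configurable": {"thread_id": "box-demo-1"}}

# Passed as the second argument to invoke(), e.g.:
# box_agent.chat.invoke({"messages": [...]}, chat_config)
print(chat_config["configurable"]["thread_id"])
```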

Once this runs, your agent is ready to go. It knows:

  • What tools are available
  • How to call each one (thanks to the docstrings and function signatures)
  • When to ask follow-up questions, invoke tools, or just respond directly

That’s it. You don’t need to write a custom planner, router, or tool selector — LangChain handles all of that for you.

Here’s how the full constructor looks:

def __init__(self, client: BoxClient, model: BaseChatModel):
    self.client = client
    self.tools = [
        StructuredTool.from_function(self.box_who_am_i, parse_docstring=True),
        StructuredTool.from_function(self.box_search_tool, parse_docstring=True),
        StructuredTool.from_function(self.box_read_tool, parse_docstring=True),
        StructuredTool.from_function(self.box_ask_ai_tool, parse_docstring=True),
        StructuredTool.from_function(self.box_search_folder_by_name, parse_docstring=True),
        StructuredTool.from_function(self.box_ai_extract_data, parse_docstring=True),
        StructuredTool.from_function(self.box_list_folder_content_by_folder_id, parse_docstring=True),
    ]

    memory = MemorySaver()
    self.chat = create_react_agent(model, self.tools, checkpointer=memory)

Why this is so simple

LangChain's create_react_agent() is designed for rapid prototyping. It handles:

  • Parsing tool inputs from natural language
  • Calling the right tool with the right args
  • Maintaining a memory chain of intermediate steps

You don’t have to configure anything beyond your tool list and a language model. This makes it perfect for writing Box agents quickly — and is also why writing an MCP server feels very similar: you’re exposing a set of capabilities and letting the LLM decide how to use them.

Calling the Agent

Now that your Box agent is instantiated, you can start having real conversations with it. For example:

def test_agent_tools_box_search(box_client_ccg: BoxClient, chat_config: str):
    client: BoxClient = box_client_ccg
    model = init_chat_model("gpt-4", model_provider="openai")
    box_agent = LangChainBoxAgent(client, model)
    response = box_agent.chat.invoke(
        {"messages": [HumanMessage(content="locate my hab-03-01 file by name")]},
        chat_config,
    )
    messages = response.get("messages", [])
    assert messages != []
    assert any("hab-03-01" in message.content.lower() for message in messages)

What’s happening behind the scenes

When you call invoke(), the agent:

  • Analyzes your message
  • Decides whether it needs to use one of the tools you defined
  • Calls the tool with arguments it extracts from your message
  • Continues reasoning with the tool’s output
  • Returns a final answer

It can even chain multiple tool calls if necessary.

For example, if you say:

“Find the ‘Q2 Budget’ folder and list all the PDFs in it.”

The agent might:

  • Call box_search_folder_by_name("Q2 Budget")
  • Extract the folder ID from the result
  • Call box_list_folder_content_by_folder_id(folder_id, is_recursive=True)
  • Filter the results to only show PDF files
  • Reply with a summary of matches

All without you having to explicitly code any of that logic.
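That chained flow can be sketched with stub tools. Here the folder and file data are made up, and the sequence is hard-coded — in the real agent, the LLM decides these steps on its own:

```python
# Stub tools standing in for the real Box-backed ones
def box_search_folder_by_name(name: str) -> dict:
    """Pretend folder lookup; returns a canned folder record."""
    return {"name": name, "id": "42"}

def box_list_folder_content_by_folder_id(folder_id: str, is_recursive: bool = False) -> list[str]:
    """Pretend folder listing; returns canned file names."""
    return ["budget.pdf", "notes.txt", "forecast.pdf"]

# The sequence the agent effectively performs for the query above
folder = box_search_folder_by_name("Q2 Budget")
files = box_list_folder_content_by_folder_id(folder["id"], is_recursive=True)
pdfs = [f for f in files if f.endswith(".pdf")]
print(pdfs)
# ['budget.pdf', 'forecast.pdf']
```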

Bonus: Debugging your Agent with LangGraph dev tools

LangGraph makes it super easy to debug your agent using the built-in dev command.


Once you’ve initialized your Box agent, just expose the react_agent like so:

from box_ai_agents_toolkit import get_ccg_client
from langchain.chat_models import init_chat_model
from langchain_box_agent.box_agent import LangChainBoxAgent

client = get_ccg_client()
user_info = client.users.get_user_me()
print(f"Connected as: {user_info.name}")

model = init_chat_model("gpt-4", model_provider="openai")
box_agent = LangChainBoxAgent(client, model, use_internal_memory=False)

react_agent = box_agent.react_agent

The LangGraph configuration file:

{
  "dependencies": [
    "."
  ],
  "graphs": {
    "agent": "./src/box_agent_langgraph.py:react_agent"
  },
  "env": ".env"
}

Then from your terminal, run:

langgraph dev --config ./src/langgraph.json

This spins up an interactive debugger where you can inspect each step your agent takes — tool inputs, tool outputs, model calls, and memory state — all in a visual flow. It’s incredibly helpful for understanding why your agent behaves the way it does and catching subtle bugs early.

This dev experience is one of the things that makes LangGraph such a powerful orchestration layer for agents.

Wrapping Up

That’s all it takes to build a fully functioning Box agent with LangChain.

  • Define tools using simple Python functions
  • Register them with LangChain
  • Instantiate your agent with a model and memory
  • Call invoke() to handle any user query

If you’ve ever written a Model Context Protocol (MCP) server, this workflow will feel familiar — just with LangChain handling the orchestration for you.

Box’s developer toolkit makes it easy to expose AI capabilities through a consistent, secure API. LangChain gives you the reasoning engine to tie it all together. Together, they make building AI-native apps feel like writing regular Python code.