Building AI-Powered Document Generation with Box MCP and Pydantic AI


This article demonstrates how to build an intelligent agent that uses Box’s document generation capabilities through the Box MCP server, all powered by Pydantic AI.

Prerequisites

Before diving in, ensure you have:

  • Python 3.11 or higher
  • A Box account with appropriate permissions
  • An OpenAI API key
  • Basic familiarity with Python development

Project setup

Let’s start by setting up our environment and installing the necessary dependencies:

# Clone the repository
git clone https://github.com/box-community/doc-gen-pydantic-ai-box-mcp-server.git
cd doc-gen-pydantic-ai-box-mcp-server

# Set up the environment using uv
uv lock
uv sync

# Create a .env file with your OpenAI API key
echo "OPENAI_API_KEY=sk-YOUR_API_KEY" > .env

Additionally, you’ll need to create a folder in your Box account named OpenAI Doc Gen where we’ll store our templates and generated documents.
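Before running the demo, it can help to confirm that the key is actually readable from the .env file. The following is a minimal sketch using only the standard library. The helpers `read_env_file` and `has_openai_key` are hypothetical and not part of the repository, which may load the file differently (for example through python-dotenv).

```python
from pathlib import Path


def read_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values: dict[str, str] = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Tolerate spaces around "=" and optional surrounding quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_openai_key(path: str = ".env") -> bool:
    """Return True when the file exists and defines a non-empty OPENAI_API_KEY."""
    if not Path(path).is_file():
        return False
    return bool(read_env_file(path).get("OPENAI_API_KEY"))
```

Calling `has_openai_key()` from the project root returns False if the file is missing or the key is empty, which is cheaper to discover here than halfway through an agent run.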

Understanding the code structure

Our project consists of two key components:

  1. demo.py: The main script orchestrating the agent’s interactions
  2. Data files: Template document (nda_template.docx) and JSON data (NDA.json)

Creating the Agent

The heart of our application is the Pydantic AI agent that interfaces with the Box MCP server. Let’s examine the key components:

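from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStdio
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider


async def main() -> None:
    # Set up the MCP server connection for Box.
    # The directory below is the author's local checkout of the Box MCP
    # server; replace it with the path on your machine.
    mcp_box = MCPServerStdio(
        command="uv",
        args=[
            "--directory",
            "/Users/rbarbosa/Documents/code/python/box/mcp-server-box",
            "run",
            "src/mcp_server_box.py",
        ],
    )
    # Initialize the OpenAI model (reads OPENAI_API_KEY from the environment)
    model = OpenAIModel("gpt-4.1-mini", provider=OpenAIProvider())

    # Create the agent with access to the Box MCP server
    agent = Agent(
        model,
        system_prompt=(
            "You are a Box Agent. "
            "Your job is to answer questions and complete actions with Box "
            "using the tools available."
        ),
        mcp_servers=[mcp_box],
    )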

This code establishes:

  1. A connection to the Box MCP server
  2. An OpenAI language model (GPT-4.1-mini)
  3. A Pydantic AI agent configured to work with Box tools
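The path passed to `--directory` above points at the author's local checkout of the Box MCP server, so it has to change on any other machine. One way to handle this, sketched below, is to read the location from an environment variable. The variable name `BOX_MCP_SERVER_DIR` and the helper `box_mcp_args` are assumptions made for illustration and do not appear in the repository.

```python
import os
from typing import Optional


def box_mcp_args(server_dir: Optional[str] = None) -> list[str]:
    """Build the argument list passed to `uv` to start the Box MCP server."""
    # BOX_MCP_SERVER_DIR is a hypothetical variable name chosen for this sketch.
    directory = server_dir or os.environ.get("BOX_MCP_SERVER_DIR")
    if not directory:
        raise ValueError("Set BOX_MCP_SERVER_DIR or pass server_dir explicitly")
    return [
        "--directory",
        os.path.abspath(os.path.expanduser(directory)),
        "run",
        "src/mcp_server_box.py",
    ]
```

The result could then be supplied as `args=box_mcp_args()` when constructing `MCPServerStdio`, keeping machine-specific paths out of the script.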

Document generation

The agent follows this workflow to generate documents:

  1. Authentication: Verify Box credentials
  2. Upload template: Transfer a template file to Box
  3. Mark as template: Configure the file as a document generation template
  4. Process data: Upload and process JSON data
  5. Generate document: Create the final document by merging template with data

Let’s see how this looks in action:

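# Inside main(), after the agent is created. TEMPLATE and DATA are the
# paths to the template and JSON files; os and textwrap must be imported.
prompt = f"""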
    Upload this local file {os.path.abspath(TEMPLATE)} to the Box 
    Folder called OpenAI Doc Gen and mark it as a doc gen template.
    Wait a few seconds for the doc gen tags to be processed by Box.
    Then upload the data file {os.path.abspath(DATA)} to the same folder, 
    and generate a new document with the template using the data file.
    """
prompt = textwrap.dedent(prompt)
async with agent.run_mcp_servers():
    result = await agent.run(prompt)

This natural language instruction is all it takes to:

  • Upload our NDA template to Box
  • Mark it as a document generation template
  • Upload the JSON data containing customer information
  • Generate a personalized NDA document
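Because the whole workflow is driven by one prompt, reusing it for other templates mostly means swapping the file paths and folder name. The sketch below wraps the prompt from the article in a function. `build_doc_gen_prompt` and its parameter names are hypothetical and not part of the demo script.

```python
import os
import textwrap


def build_doc_gen_prompt(template_path: str, data_path: str, folder_name: str) -> str:
    """Compose the agent instruction for a template, a data file, and a Box folder."""
    prompt = f"""
        Upload this local file {os.path.abspath(template_path)} to the Box
        Folder called {folder_name} and mark it as a doc gen template.
        Wait a few seconds for the doc gen tags to be processed by Box.
        Then upload the data file {os.path.abspath(data_path)} to the same folder,
        and generate a new document with the template using the data file.
        """
    # dedent removes the shared indentation; strip drops the outer blank lines.
    return textwrap.dedent(prompt).strip()
```

The returned string could be passed to `agent.run(...)` in place of the inline prompt, for example `build_doc_gen_prompt("nda_template.docx", "NDA.json", "OpenAI Doc Gen")`.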

Data structure

Our JSON data file provides the information needed to populate the template:

{
  "contract": {
    "customerID": "12345678",
    "date": "03-18-2025",
    "customerName": "Bob Smith",
    "customerAddress": {
      "street": "9876 S 27th St",
      "city": "Dallas",
      "zip": "77777",
      "state": "TX"
    }
  },
  "file_name": "Bob_Smith_NDA.pdf"
}
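The article does not show the tags inside nda_template.docx, but nested data like this is naturally addressed by dotted paths such as `contract.customerAddress.city`. The sketch below only illustrates how such a path resolves against the JSON, which can be handy for checking a data file before uploading it. `resolve_path` is a hypothetical helper and says nothing about how Box processes templates internally.

```python
import json
from typing import Any

SAMPLE = """
{
  "contract": {
    "customerID": "12345678",
    "date": "03-18-2025",
    "customerName": "Bob Smith",
    "customerAddress": {
      "street": "9876 S 27th St",
      "city": "Dallas",
      "zip": "77777",
      "state": "TX"
    }
  },
  "file_name": "Bob_Smith_NDA.pdf"
}
"""


def resolve_path(data: dict[str, Any], dotted_path: str) -> Any:
    """Walk a nested dict following a dotted path like 'contract.date'."""
    current: Any = data
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"'{dotted_path}' is missing at '{key}'")
        current = current[key]
    return current


data = json.loads(SAMPLE)
city = resolve_path(data, "contract.customerAddress.city")
```

A missing field raises `KeyError` with the failing segment named, so a typo in the data file surfaces locally rather than as an incomplete generated document.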

Behind the scenes

When our agent runs, it translates natural language instructions into specific Box API operations through the MCP server. The print_tools_used function reveals which Box operations are executed:

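from pydantic_ai.agent import AgentRunResult


def print_tools_used(agent_result: AgentRunResult):
    # Walk every message exchanged during the run
    all_messages = agent_result.all_messages()
    for message in all_messages:
        for part in message.parts:
            # Only tool-call parts carry the name of an invoked tool
            if part.part_kind == "tool-call":
                type_writer_effect_machine(
                    f"  {part.tool_name}", is_dim=True, delay=0.01
                )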

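The function walks every message from the run and prints the name of each tool call, so you can see exactly which Box tools the agent chose during document generation. Here type_writer_effect_machine is a display helper that is not shown in this article.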
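If the tool names are needed as data rather than terminal output, for logging or assertions, the same traversal can return a list. This sketch assumes only what the function above relies on: each message exposes `parts`, and each tool-call part exposes `part_kind` and `tool_name`. `collect_tool_names` is a hypothetical helper, and the stand-in messages are built with `SimpleNamespace` so the block runs without Pydantic AI installed. The tool names in the stand-in data are placeholders, not real Box MCP tool names.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_tool_names(messages: Iterable[Any]) -> list[str]:
    """Return tool names from tool-call parts, in the order they appear."""
    names: list[str] = []
    for message in messages:
        for part in message.parts:
            if part.part_kind == "tool-call":
                names.append(part.tool_name)
    return names


# Stand-in messages with placeholder tool names for demonstration only.
fake_messages = [
    SimpleNamespace(parts=[SimpleNamespace(part_kind="user-prompt")]),
    SimpleNamespace(
        parts=[SimpleNamespace(part_kind="tool-call", tool_name="example_upload")]
    ),
    SimpleNamespace(
        parts=[
            SimpleNamespace(part_kind="text"),
            SimpleNamespace(part_kind="tool-call", tool_name="example_generate"),
        ]
    ),
]
tool_names = collect_tool_names(fake_messages)
```

With a real run, the equivalent call would be `collect_tool_names(result.all_messages())`.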

The generated file in Box

The end result of this exercise is a generated file, combining the template with the JSON data we provided. The agent was able to trigger the Doc Gen Box tool through the MCP server.

Advantages of the Pydantic AI approach

Using Pydantic AI with the Box MCP server offers several benefits:

  1. Natural language interface: No need to write complex Box API code
  2. Automated workflows: Chain multiple operations together seamlessly
  3. Flexible document generation: Easily adapt to different templates and data structures
  4. Improved developer experience: Reduce boilerplate and focus on business logic

Conclusion

By combining Pydantic AI with Box’s document generation capabilities, developers can create sophisticated document automation workflows with minimal code. The approach demonstrated here not only simplifies the technical implementation but also opens up possibilities for more complex document generation scenarios.

Whether you’re building internal tools, customer-facing applications, or enterprise workflows, this pattern provides a flexible foundation that can grow with your needs.

Happy coding!