AI Agents with Box and Google MCP Toolbox for Databases

|
Share
AI Agents with Box and Google MCP Toolbox for Databases

One of the most intriguing aspects of AI agents is the ability to provide multiple data sources to your agent so that the agent can decide where to get the data it needs at the time it is needed. One of the key data types needed in this world is unstructured data, and that is where Box can step in to help you provide security for your most sensitive data while also allowing you to provide that context to your AI applications as you see fit.

One of the things we are seeing a lot is the combination of this unstructured data with structured data. Quite often this is in the form of key/value pair extraction from your Box content using our extract APIs. This data can then be used to feed your databases and applications, or combined with other data and attached back to the file in Box in the form of Metadata.

Because of this trend, we were intrigued when the news broke that the Google Database team was releasing GenAI Toolbox for Databases (renamed to MCP Toolbox for Databases), a set of easily configurable AI tools meant to help your agents interact with the database of your choice. So we sat down to build an application that employs an AI agent with this Toolbox and Box agents tools.

As usual, building the agent was straightforward. We used a combination of LangChain, the langchain-toolbox library, and the Box AI Agent Toolkit to build an AI agent that can talk to Box, the Toolbox, and Open AI gpt-4.5-preview. We wanted to find an interesting use case, and we settled on the idea of extracting data from a mortgage application in Box and using it to compile a property abstract based on data in our Postgresql database. You can find the code for this agent here.

To run this code yourself, there is some set up involved. Of course you need a Postgresql database. You can follow the instructions from Postgres to get that started.

Once installed, let’s set up our sample database.

1. Connect to postgres using the psql command:

psql -h 127.0.0.1 -U postgres

Here, postgres denotes the default postgres superuser.

2. Create a new database and a new user:

For a real application, it’s best to follow the principle of least permission and only grant the privileges your application needs.

CREATE USER property_manager WITH PASSWORD 'property_manager';
CREATE DATABASE property_db;
GRANT ALL PRIVILEGES ON DATABASE property_db TO property_manager;
ALTER DATABASE property_db OWNER TO property_manager;

3. End the database session:

\q

4. Connect to your database with your new user:

psql -h 127.0.0.1 -U property_manager -d property_db

5. Create 6 tables using the following commands:

properties table:

CREATE TABLE properties(
  id INTEGER NOT NULL PRIMARY KEY,
  address VARCHAR(100) NOT NULL,
  sq_ft INTEGER NOT NULL,
  property_type VARCHAR(20) NOT NULL
);

title_history table:

CREATE TABLE title_history(
  id INTEGER NOT NULL PRIMARY KEY,
  deed VARCHAR(100) NOT NULL,
  deed_date DATE NOT NULL,
  owner VARCHAR(100) NOT NULL,
  mortgage VARCHAR(100) NOT NULL,
  mortgage_date DATE NOT NULL,
  mortgage_amount INTEGER NOT NULL,
  property_id INTEGER,
  CONSTRAINT property_id_fkey FOREIGN KEY(property_id)
  REFERENCES properties(id)
);

legal_actions table:

CREATE TABLE legal_actions(
  id INTEGER NOT NULL PRIMARY KEY,
  case_description VARCHAR(100) NOT NULL,
  case_date DATE NOT NULL,
  case_parties VARCHAR(100) NOT NULL,
  property_id INTEGER,
  CONSTRAINT property_id_fkey FOREIGN KEY(property_id)
  REFERENCES properties(id)
);

liens table:

CREATE TABLE liens(
  id INTEGER NOT NULL PRIMARY KEY,
  liens VARCHAR(100) NOT NULL,
  lien_date DATE NOT NULL,
  lien_amount INTEGER NOT NULL,
  property_id INTEGER,
  CONSTRAINT property_id_fkey FOREIGN KEY(property_id)
  REFERENCES properties(id)
);

taxes table:

CREATE TABLE taxes(
  id INTEGER NOT NULL PRIMARY KEY,
  bill VARCHAR(100) NOT NULL,
  bill_date DATE NOT NULL,
  bill_amount INTEGER NOT NULL,
  property_id INTEGER,
  CONSTRAINT property_id_fkey FOREIGN KEY(property_id)
  REFERENCES properties(id)
);

easements table:

CREATE TABLE legal_actions(
  id INTEGER NOT NULL PRIMARY KEY,
  easement VARCHAR(100) NOT NULL,
  easement_date DATE NOT NULL,
  easement_description VARCHAR(100) NOT NULL,
  property_id INTEGER,
  CONSTRAINT property_id_fkey FOREIGN KEY(property_id)
  REFERENCES properties(id)
);

6. Insert data into the tables.

Populate properties table:

INSERT INTO properties (address, sq_ft, property_type)
VALUES ('123 Main St, Austin, TX 78701', 1000, 'Residential');

Populate title_history table:

INSERT INTO title_history (deed, deed_date, owner, mortgage, mortgage_date, mortgage_amount, property_id)
VALUES ('Deed 123456, County Recorders Office','2010–01–01','John Doe','Mortgage 78901, Bank of America','2015–05–15',100000, 1);

Populate legal_actions table:

INSERT INTO legal_actions (case_description, case_date, case_parties, property_id)
VALUES ('Case 901234, Travis County Court','2022–03–15','John Doe vs. Jane Doe',1);

Populate leins table:

INSERT INTO liens (lien, lien_date, lien_amount, property_id)
VALUES ('Lien 345678, City of Austin','2020–01–01',5000,1);

Populate taxes table:

INSERT INTO taxes (bill, bill_date, bill_amount, property_id)
VALUES ('Tax Bill 567890, Travis County Tax Office','2023–01–01',1000,1);

Populate easements table:

INSERT INTO easements (easement, easement_date, easement_description, property_id)
VALUES ('Easement 112233, Utility Company','2012–01–01','Utility lines',1);

7. End the database session:

\q

The code to provide the tools to your agent is pretty straight forward. First, using the Box AI Agent Toolkit, we provided a number of tools to our agent:

from dataclasses import dataclass
import json
import logging
import dotenv
import os
from typing import Dict, Iterable, List, Union
from box_ai_agents_toolkit import (
  File,
  Folder,
  SearchForContentContentTypes,
  get_ccg_client,
  box_file_get_by_id,
  box_file_text_extract,
  box_file_ai_ask,
  box_file_ai_extract,
  box_file_ai_extract_structured,
  box_folder_text_representation,
  box_folder_ai_ask,
  box_folder_ai_extract,
  box_folder_ai_extract_structured,
  box_search,
  box_locate_folder_by_name,
  box_folder_list_content
)
from box_ai_agents_toolkit.box_api import BoxFileExtended
from langchain_core.tools import tool
logger = logging.getLogger(__name__)
client = get_ccg_client()
@tool
def box_tool_file_get_by_id(file_id: str) -> File:
  """
  Get information about a file in Box by its ID.
 
  Args:
    file_id (str): The ID of the file to read.
  return:
    str: The Box File object.
  """
  return box_file_get_by_id(client, file_id=file_id)
@tool
def box_tool_file_text_extract(file_id: str) -> str:
  """
  Read the text content of a file in Box.
  Args:
    file_id (str): The ID of the file to read.
  return:
  str: The text content of the file.
  """
  return box_file_text_extract(client, file_id=file_id)
@tool
def box_tool_file_ai_ask(
  file_id: str, prompt: str
) -> str:
  """
  Ask box ai about a file in Box.
  Args:
    file_id (str): The ID of the file to read.
    prompt (str): The prompt to ask the AI.
  return:
    str: The text content of the file.
  """
  return box_file_ai_ask(client=client, file_id=file_id, prompt=prompt)
@tool
def box_tool_file_ai_extract(
  file_id: str, prompt: str
) -> dict:
  """
  Extract data from a file in Box from a string promptusing AI.
  Args:
    file_id (str): The ID of the file to read.
    fields (str): The fields to extract from the file.
  return:
    str: The extracted data in a json string format.
  """
  return box_file_ai_extract(client=client, file_id=file_id, prompt=prompt)
@tool
def box_tool_file_ai_extract_structured(
  file_id: str, fields_json_str: str
) -> str:
  """
  Extract data from a file in Box from a json string prompt using AI.
  Args:
    file_id (str): The ID of the file to read.
    fields_json_str (str): The fields to extract from the file.
  return:
    str: The extracted data in a json string format.
  """
  return box_file_ai_extract_structured(client=client, file_id=file_id, fields_json_str=fields_json_str)
@tool
def box_tool_folder_text_representation(
  folder_id: str,
  is_recursive: bool = False,
  by_pass_text_extraction: bool = False,
) -> Iterable[BoxFileExtended]:
  """
  Read the text content of a folder in Box.
  Args:
    folder_id (str): The ID of the folder to read.
    is_recursive (bool): Whether to read the folder recursively.
    by_pass_text_extraction (bool): Whether to bypass text extraction.
  return:
    Iterable[BoxFileExtended]: The text content of the files.
  """
  yield from box_folder_text_representation(client=client, folder_id=folder_id, is_recursive=is_recursive, by_pass_text_extraction=by_pass_text_extraction)
@tool
def box_tool_folder_ai_ask(
  folder_id: str,
  prompt: str,
  is_recursive: bool = False,
  by_pass_text_extraction: bool = False,
) -> Iterable[BoxFileExtended]:
  """
  Ask box ai about a folder in Box.
  
  Args:
    folder_id (str): The ID of the folder to read.
    prompt (str): The prompt to ask the AI.
    is_recursive (bool): Whether to read the folder recursively.
    by_pass_text_extraction (bool): Whether to bypass text extraction.
  return:
    Iterable[BoxFileExtended]: The text content of the files.
  """
  yield from box_folder_ai_ask(client=client, folder_id=folder_id, prompt=prompt, is_recursive=is_recursive, by_pass_text_extraction=by_pass_text_extraction)
@tool
def box_tool_folder_ai_extract(
  folder_id: str,
  prompt: str,
  is_recursive: bool = False,
  by_pass_text_extraction: bool = False,
) -> Iterable[BoxFileExtended]:
  """
  Extract data from a folder in Box from a string prompt using AI.
  Args:
    folder_id (str): The ID of the folder to read.
    prompt (str): The prompt to ask the AI.
    is_recursive (bool): Whether to read the folder recursively.
    by_pass_text_extraction (bool): Whether to bypass text extraction.
  """
  yield from box_folder_ai_extract(client=client, folder_id=folder_id, prompt=prompt, is_recursive=is_recursive, by_pass_text_extraction=by_pass_text_extraction)
@tool
def box_tool_folder_ai_extract_structured(
  folder_id: str,
  fields_json_str: str,
  is_recursive: bool = False,
  by_pass_text_extraction: bool = False,
) -> Iterable[BoxFileExtended]:
  """
  Extract data from a folder in Box from a json string prompt using AI.
  Args:
    folder_id (str): The ID of the folder to read.
    fields_json_str (str): The fields to extract from the folder.
    is_recursive (bool): Whether to read the folder recursively.
    by_pass_text_extraction (bool): Whether to bypass text extraction.
  """
  yield from box_folder_ai_extract_structured(client=client, folder_id=folder_id, fields_json_str=fields_json_str, is_recursive=is_recursive, by_pass_text_extraction=by_pass_text_extraction)
@tool
def box_tool_search(
  query: str,
  file_extensions: List[str] | None = None,
  content_types: List[SearchForContentContentTypes] | None = None,
  ancestor_folder_ids: List[str] | None = None,
) -> List[File]:
  """
  Search for files in Box with the given query.
  
  Args:
    query (str): The query to search for.
    file_extensions (List[str]): The file extensions to search for, for example *.pdf
    content_types (List[SearchForContentContentTypes]): where to look for the information, possible values are:
      NAME
      DESCRIPTION,
      FILE_CONTENT,
      COMMENTS,
      TAG,
    ancestor_folder_ids (List[str]): The ancestor folder IDs to search in.
  return:
    str: The search results.
  """
  # content_types: List[SearchForContentContentTypes] = [
  # SearchForContentContentTypes.NAME,
  # SearchForContentContentTypes.DESCRIPTION,
  # # SearchForContentContentTypes.FILE_CONTENT,
  # SearchForContentContentTypes.COMMENTS,
  # SearchForContentContentTypes.TAG,
  # ]
  return box_search(client=client, query=query, file_extensions=file_extensions, content_types=content_types, ancestor_folder_ids=ancestor_folder_ids)
@tool
def box_tool_locate_folder_by_name(
  folder_name: str, parent_folder_id: str = "0"
) -> List[Folder]:
  """
  Locate a folder in Box by its name.
  
  Args:
    folder_name (str): The name of the folder to locate.
  return:
    str: The folder ID.
  """
  return box_locate_folder_by_name(client=client, folder_name=folder_name, parent_folder_id=parent_folder_id)
@tool
def box_tool_folder_list_content(
  folder_id: str, is_recursive: bool = False
) -> List[Union[File, Folder]]:
  """
  List the content of a folder in Box by its ID.
  
  Args:
    folder_id (str): The ID of the folder to list the content of.
    is_recursive (bool): Whether to list the content recursively.
  return:
  str: The content of the folder in a json string format, including the "id", "name", "type", and "description".
  """
  # fields = "id,name,type"
  return box_folder_list_content(client=client, folder_id=folder_id, is_recursive=is_recursive)

We then created a tools.yaml that maps tools to SQL statements.

sources:
  my-pg-source:
    kind: postgres
    host: 127.0.0.1
    port: 5432
    database: property_db
    user: property_manager
    password: property_manager
tools:
  get-property-by-address:
    kind: postgres-sql
    source: my-pg-source
    description: Search for a property based on the address.
    parameters:
    - name: address
      type: string
      description: The address of the property.
      statement: SELECT * FROM properties WHERE address = $1;
  get-title-history:
    kind: postgres-sql
    source: my-pg-source
    description: get the title history for a property.
    parameters:
    - name: address
      type: string
      description: The address of the property.
      statement: SELECT th.deed, th.deed_date, th.owner, th.mortgage, th.mortgage_date, th.mortgage_amount FROM properties p INNER JOIN title_history th ON p.id = th.property_id WHERE p.address = $1;
  get-liens:
    kind: postgres-sql
    source: my-pg-source
    description: >-
      Get liens for a property.
    parameters:
    - name: address
      type: string
      description: The address of the property.
      statement: SELECT l.lien, l.lien_amount, l.lien_date FROM properties p INNER JOIN liens l ON p.id = l.property_id WHERE p.address = $1;
  get-legal-actions:
    kind: postgres-sql
    source: my-pg-source
    description: >-
      Get legal actions for a property.
    parameters:
    - name: address
      type: string
      description: The address of the property.
      statement: SELECT la.case_description, la.case_date, la.case_parties FROM properties p INNER JOIN legal_actions la ON p.id = la.property_id WHERE p.address = $1;
  get-tax-history:
    kind: postgres-sql
    source: my-pg-source
    description: Get tax history for a property.
    parameters:
    - name: address
      type: string
      description: The address of the property.
      statement: SELECT t.bill, t.bill_date, t.bill_amount FROM properties p INNER JOIN taxes t ON p.id = t.property_id WHERE p.address = $1;
  get-easements:
    kind: postgres-sql
    source: my-pg-source
    description: Get easements for a property.
    parameters:
    - name: address
      type: string
      description: The address of the property.
      statement: SELECT e.easement, e.easement_date, e.easement_description FROM properties p INNER JOIN easements e ON p.id = e.property_id WHERE p.address = $1;

And finally, we started the Toolbox. This command may vary depending on how you installed it. For us, we installed it with curl and it lives in the same directory as our code, so we started it at the commandline with the shell command:

./toolbox - tools_file "tools.yaml"

Tying it all together, we need to create our agent, grab the tools, gather the prompts, and provide them to the agent.

from typing import List, Sequence, cast

from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver

from langchain_openai import ChatOpenAI
from langchain_core.tools import BaseTool

from toolbox_langchain import ToolboxClient

from box_tools import (
  box_tool_file_get_by_id,
  box_tool_file_text_extract,
  box_tool_file_ai_ask,
  box_tool_file_ai_extract,
  box_tool_file_ai_extract_structured,
  box_tool_folder_text_representation,
  box_tool_folder_ai_ask,
  box_tool_folder_ai_extract,
  box_tool_folder_ai_extract_structured,
  box_tool_search,
  box_tool_locate_folder_by_name,
  box_tool_folder_list_content
)

prompt = """
  You're a helpful real estate researcher. You research properties and
  provide information about them. You get an address from a mortgage
  application and research the history of the property. You then prepare
  a property abstract report for the mortgage application.
"""

queries = [
  "Can you find the mortgage application in my 'mortgage documents' folder?",
  "Can you extract the name, address, and price of the property from the mortgage application?",
  "Can make sure the address is a property in our database?",
  "What is the title history for this property?",
  "What are the liens for this property?",
  "What are the legal actions for this property?",
  "What is the tax history for this property?",
  "What are the easements for this property?",
  """
    Please generate a property abstract report for this property.
    The report should include the following information:
    - Name of the property
    - Address of the property
    - Price of the property
    - Title history of the property
    - Lien history of the property
    - Legal actions of the property
    - Tax history of the property
    - Easements of the property
    The report should be in a format that is easy to read and understand.
  """,
]

def main():
  model = ChatOpenAI(model="gpt-4.5-preview")

  # Load the tools from the Toolbox server
  client = ToolboxClient("http://127.0.0.1:5000")
  toolbox_tools = client.load_toolset()

  box_tools = [
    box_tool_file_get_by_id,
    box_tool_file_text_extract,
    box_tool_file_ai_ask,
    box_tool_file_ai_extract,
    box_tool_file_ai_extract_structured,
    box_tool_folder_text_representation,
    box_tool_folder_ai_ask,
    box_tool_folder_ai_extract,
    box_tool_folder_ai_extract_structured,
    box_tool_search,
    box_tool_locate_folder_by_name,
    box_tool_folder_list_content
  ]

  tools: Sequence[BaseTool] = cast(List[BaseTool], toolbox_tools) + box_tools

  agent = create_react_agent(model, tools, checkpointer=MemorySaver())
  config = {"configurable": {"thread_id": "thread-1"}}

  for query in queries:
    print(f"{query}\n\n")
    inputs = {"messages": [("user", prompt + query)]}
    response = agent.invoke(inputs, stream_mode="values", config=config)
    print(f"{response["messages"][-1].content}\n\n")

if __name__ == "__main__":
  main()

The trickiest part of this whole thing was combining our agent tools and the Toolbox database tools. This is because the Toolbox returns a typed list of ToolboxTools, where as the Box tools are in the form of a typed list of BaseTool. We were able to combine them by casting the ToolboxTool type to BaseTool.

tools: Sequence[BaseTool] = cast(List[BaseTool], toolbox_tools) + box_tools

Here is a sample of the completed property abstract:

## Property Abstract Report

### Property Information
- **Name:** Albert Einstein
- **Address:** 123 Main St, Austin, TX 78701
- **Property Price:** $300,000
- **Property Type:** Residential
- **Square Footage:** 1,000 sq ft
- -

### Title History
- **Owner:** John Doe
- **Deed:** Deed 123456, recorded at the County Recorder's Office
- **Deed Date:** January 1, 2010
- **Mortgage:** Mortgage 789012, Bank of America
- **Mortgage Amount:** $100,000
- **Mortgage Date:** May 15, 2015
- -

### Lien History
- **Lien:** Lien 345678, filed by the City of Austin
- **Lien Amount:** $5,000
- **Lien Date:** January 1, 2020
- -

### Legal Actions
- **Case Number:** Case 901234, Travis County Court
- **Parties Involved:** John Doe vs. Jane Doe
- **Case Date:** March 15, 2022
- -

### Tax History
- **Tax Bill Number:** Tax Bill 567890, Travis County Tax Office
- **Amount:** $1,000
- **Date Issued:** January 1, 2023
- -

### Easements
- **Easement Record:** Easement 112233, Utility Company
- **Easement Date:** January 1, 2012
- **Description:** Utility lines easement for maintenance and infrastructure access
```

Let us know what you think, and happy coding!

🦄 Want to engage with other Box Platform champions?

Join our Box Developer Community for support and knowledge sharing!