If you’ve built anything beyond a toy AI agent, you’ve probably run into this moment: The model works, the prompt seems fine, the tools are wired up correctly — and yet the system starts to feel fragile.
As soon as you introduce multi-step workflows, planning, memory, or document generation, context becomes the real challenge. You can add retrieval, expand the prompt, or bolt on more tools, but at some point what you really need isn’t more intelligence. You need structure.
One abstraction that’s quietly proving to be incredibly effective for agents is something very familiar: the file system.
One filesystem. Multiple backends.
Christian Bromann from LangChain recently shared a Deep Agents demo that really resonated with me. The idea is deceptively simple: The agent sees a single filesystem, but under the hood, that filesystem is composed of multiple backends. In his example:
- /docs/ is backed by cloud storage
- /memories/ is backed by SQLite where relational data is synthesized into files
- /workspace/ maps to local disk for generated outputs.
From the agent’s perspective, it’s all just a directory tree. It uses standard filesystem operations like ls, read_file, and write_file, and the backend layer handles the translation.
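Deep Agents expresses this translation layer as a backend contract. The real interface isn't reproduced in this post, so here's a minimal sketch of the idea with illustrative names — an interface plus a trivial in-memory implementation, just to show the shape:

```typescript
// Illustrative sketch only — not the actual Deep Agents BackendProtocol.
interface FileEntry {
  path: string;
  type: "file" | "directory";
}

interface AgentFilesystemBackend {
  ls(dirPath: string): Promise<FileEntry[]>;
  read(filePath: string): Promise<string>;
  write(filePath: string, content: string): Promise<void>;
}

// A toy in-memory backend: the agent calls ls/read/write,
// and the backend decides what those operations actually mean.
class InMemoryBackend implements AgentFilesystemBackend {
  private files = new Map<string, string>();

  async ls(dirPath: string): Promise<FileEntry[]> {
    const prefix = dirPath.endsWith("/") ? dirPath : dirPath + "/";
    return [...this.files.keys()]
      .filter((p) => p.startsWith(prefix))
      .map((p) => ({ path: p, type: "file" as const }));
  }

  async read(filePath: string): Promise<string> {
    return this.files.get(filePath) ?? `Error: File '${filePath}' not found`;
  }

  async write(filePath: string, content: string): Promise<void> {
    this.files.set(filePath, content);
  }
}
```

Any storage system that can satisfy this small contract can sit behind the agent's directory tree.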
We forked that project and swapped the cloud storage layer for a Box-backed filesystem to explore what this pattern looks like in an enterprise context. Not because the original storage choice was wrong — it absolutely works — but because we wanted to see what happens when that filesystem abstraction becomes secure, governed, and collaborative by default.
Why filesystem-based context just works
One of the subtle strengths of Deep Agents is how it treats context. Instead of stuffing everything into a prompt or relying entirely on retrieval, the agent navigates a filesystem. It browses directories, opens specific files, writes drafts, and re-reads its own outputs. The reasoning becomes incremental and structured.
The SQLite backend in the original demo is a great illustration of this idea. There aren’t actually user profile “files” stored anywhere. Paths like /memories/users/sarah-chen.json are generated dynamically from relational tables. Rows become JSON. Table joins become readable Markdown summaries. The agent never sees SQL — it simply reads files.
That’s the key insight: A filesystem doesn’t have to mean literal files on disk. It’s an interface contract. The agent only needs to understand how to navigate files.
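To make that concrete, here's a hedged sketch of what synthesizing a virtual file from relational data might look like. The table shape and path scheme below are assumptions for illustration — the demo's real schema and queries aren't shown in this post:

```typescript
// Illustrative sketch: a "file" that is really a row in a table.
// In the real backend this lookup would be a SQLite query.
type UserRow = { slug: string; name: string; company: string };

const users: UserRow[] = [
  { slug: "sarah-chen", name: "Sarah Chen", company: "Acme" },
];

function readVirtualFile(path: string): string | null {
  // e.g. "/memories/users/sarah-chen.json" → slug "sarah-chen"
  const match = path.match(/^\/memories\/users\/([^/]+)\.json$/);
  if (!match) return null;
  const row = users.find((u) => u.slug === match[1]);
  // Rows become JSON; the agent only ever sees file contents.
  return row ? JSON.stringify(row, null, 2) : null;
}
```

The path exists only at read time; nothing under /memories/ is stored as a file anywhere.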
Beyond local disk
For prototypes, a local filesystem works beautifully with this pattern. But as soon as you move into real enterprise workflows, the limitations start to show.
Local filesystems are personal rather than collaborative. They aren’t permission-aware in a way that maps cleanly to organizational access models. They don’t provide governance controls or audit trails. And they’re not designed to support shared, cross-team workflows.
If your agent is generating proposals, summarizing sensitive documents, or drafting policy updates, those constraints matter.
The interesting part is that the agent itself doesn’t need to change. If it depends only on filesystem semantics, you can upgrade the filesystem layer underneath it — without rewriting your reasoning logic.
Swapping in Box as the filesystem layer
In our fork, we implemented a BoxBackend that plugs directly into Deep Agents’ BackendProtocol. From the agent’s perspective, nothing changes. It still calls filesystem tools like ls, read_file, and write_file. The abstraction contract stays the same. What changes is what happens underneath.
The original demo uses S3 for this layer, which works perfectly well for cloud-backed storage. In our fork, we explored what it looks like when that same abstraction is backed by Box.
A filesystem read operation becomes a path resolution step followed by a Box file download. The agent might request something like /docs/pricing.md, but under the hood we resolve that virtual path to a Box file ID and fetch its contents via the Box API:
```typescript
async read(filePath: string): Promise<string> {
  const resolved = await this.resolvePath(filePath);
  if (!resolved || resolved.type !== "file") {
    return `Error: File '${filePath}' not found`;
  }
  const content = await this.downloadFileContent(resolved.id);
  return content ?? "";
}
```

The agent never sees file IDs. It never sees API calls. It just reads files.
Writes follow the same pattern. If a file already exists, we upload a new version instead of silently overwriting it — which means agent-generated artifacts automatically inherit version history:
```typescript
async upsertFile(filePath: string, content: string) {
  const existing = await this.resolvePath(filePath);
  if (existing && existing.type === "file") {
    await this.client.uploads.uploadFileVersion(existing.id, {
      attributes: { name: filePath.split("/").pop()! },
      file: stringToByteStream(content),
    });
    return;
  }
  return this.uploadFile(filePath, content);
}
```

From the agent’s point of view, it’s just writing a file. From the system’s point of view, we’re enforcing folder scoping, permission-aware access, and versioned writes inside Box.
That’s the power of the abstraction. The agent still thinks it’s interacting with a filesystem. In reality, it’s interacting with a governed, collaborative, enterprise-grade content layer.
In other words, you’re giving your agent a cloud filesystem that’s secure, collaborative, and governed — without changing how the agent works.
What this pattern enables
Once /docs/ is backed by Box instead of local disk, the pattern becomes much more interesting.
An agent can draft a proposal using governed company documentation and write the output directly into a shared workspace. It can respect existing permission models automatically. It can operate within audit and compliance boundaries without prompt-level logic trying to simulate access control.
Because versioning is built in, agent-generated artifacts don’t just overwrite previous work — they evolve. Because collaboration is native, those artifacts don’t live on someone’s laptop. They live in shared spaces where teams already work.
The key thing is that none of this required changing the agent itself. We didn’t rewrite prompts. We didn’t bolt on custom governance logic. We changed the filesystem layer. That separation of concerns is what makes this pattern durable.
Code walkthrough
At a high level, the architecture stays very close to the original Deep Agents example. The difference is simply what backs each path.
We use a CompositeBackend to route different directory prefixes to different storage systems:
```typescript
const boxBackend = new BoxBackend({ rootFolderId: "0" });
await boxBackend.ensureRootFolder("deep-agents-docs");
await boxBackend.warmCache();

const backend = new CompositeBackend({
  "/workspace/": new FilesystemBackend({
    rootDir: "./workspace",
    virtualMode: true,
  }),
  "/memories/": new SQLiteBackend({
    dbPath: "./data/memories.db",
  }),
  "/docs/": boxBackend,
});
```

From there, creating the agent looks familiar if you’ve used Deep Agents before. You provide:
- A model
- A checkpointer for long-running memory
- A system prompt that explains where different types of information live
- The composite backend
```typescript
const agent = createDeepAgent({
  model,
  checkpointer,
  systemPrompt,
  backend,
});
```

Now the agent sees a single unified filesystem:
- /docs/ → Box
- /memories/ → SQLite (virtual files generated from relational data)
- /workspace/ → Local disk
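The routing itself is just prefix matching on the virtual path. As a hedged sketch (the real CompositeBackend's matching rules may differ), the dispatch could look like this:

```typescript
// Illustrative sketch of prefix-based routing — not the actual
// CompositeBackend implementation.
type Backend = { name: string }; // stand-in for a real backend object

function routeByPrefix(
  routes: Record<string, Backend>,
  path: string
): Backend | null {
  // Prefer the longest matching prefix, so a more specific mount
  // like "/docs/archive/" could override "/docs/" if both exist.
  const prefixes = Object.keys(routes).sort((a, b) => b.length - a.length);
  const hit = prefixes.find((p) => path.startsWith(p));
  return hit ? routes[hit] : null;
}
```

Every filesystem tool call the agent makes is resolved through this kind of lookup before any backend code runs.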
When we invoke the agent with a prompt like:
```typescript
await agent.invoke(`
  Generate a personalized sales proposal for Sarah Chen.
  Read relevant documents from /docs/.
  Use customer data from /memories/.
  Write the final proposal to /workspace/sarah-chen-proposal.md.
`);
```

…it navigates that filesystem exactly as instructed. It lists files, opens documents, reads synthesized customer history from SQLite, drafts a proposal, and writes the output to the workspace. Because /docs/ is backed by Box, we can also upload that generated proposal back into Box with version history preserved.
The agent doesn’t know it’s interacting with three different storage systems. It only knows it’s interacting with a filesystem.
Filesystems as the future of agent context
If there’s one thing this pattern makes clear, it’s this: Filesystems are becoming a core abstraction for AI agents.
Not because storage is inherently exciting, but because structure is. A filesystem gives agents a predictable, navigable way to manage context across documents, memory, and generated outputs. It becomes a stable interface between reasoning and data — one that scales as workflows become more complex.
When you treat the filesystem as a contract rather than a storage mechanism, you gain real flexibility. It can be backed by relational databases, object storage, local disk, or an enterprise-grade content platform — without rewriting the agent itself.
That separation of concerns matters. It keeps agent logic focused and clean while allowing the underlying context layer to evolve independently.
If you’d like to explore this pattern yourself, you can try the full example here: https://github.com/box-community/deepagents-filesystem-example
You’ll need a Box developer account, which you can create for free here: http://account.box.com/developer/signup
As agents move from prototypes to production systems, the conversation is shifting. It’s no longer just about which model you’re using. It’s about how you structure context — and how reliably your system can reason over it.
Filesystems are emerging as one of the most practical answers to that question.