Secure RAG: Powering and safeguarding AI innovation

Generative AI has the potential to revolutionize the way industries work with their content, enable greater efficiency, and enhance decision-making processes.

Using large language models (LLMs) trained on massive datasets of text and code, GenAI can identify patterns, understand context, and generate human-like outputs: text, images, and even videos. A key methodology for grounding those outputs in relevant, trusted information is retrieval augmented generation, or RAG.

Why is RAG important to enterprises?

With the rapid evolution of GenAI capabilities, most enterprises are contemplating strategies to integrate more AI across their operations, both to meet their business goals and to stay ahead of the competition.

For enterprise use cases, content and insights generated by AI are only as good as the contextual, relevant information fed to the LLM. RAG allows LLMs to draw on massive amounts of information without the risk of overwhelming the model.
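
To make the pattern concrete, here is a minimal sketch of a RAG flow in Python. The keyword-overlap retriever and the llm_generate() placeholder are hypothetical stand-ins for illustration, not any specific vendor's API.

```python
# A minimal sketch of the RAG pattern: retrieve a few relevant chunks,
# augment the prompt with them, then generate. The keyword-overlap scorer
# and llm_generate() are toy stand-ins, not any specific vendor's API.

def score(chunk: str, question: str) -> int:
    """Toy relevance score: how many question words appear in the chunk."""
    words = set(question.lower().split())
    return sum(1 for w in chunk.lower().split() if w in words)

def retrieve(corpus: list, question: str, top_k: int = 3) -> list:
    """Retrieve only the top-k most relevant chunks, not the whole corpus."""
    return sorted(corpus, key=lambda c: score(c, question), reverse=True)[:top_k]

def llm_generate(prompt: str) -> str:
    """Placeholder for a real LLM call."""
    return f"[answer grounded in a {len(prompt)}-character prompt]"

def answer_with_rag(corpus: list, question: str) -> str:
    chunks = retrieve(corpus, question)
    # Augment: ground the model in the retrieved chunks only, which keeps
    # the prompt small no matter how large the corpus is.
    prompt = ("Answer using only the context below.\n\n"
              "Context:\n" + "\n\n".join(chunks) +
              "\n\nQuestion: " + question)
    return llm_generate(prompt)
```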

Many publicly available AI tools leverage RAG, but the underlying LLMs are trained on publicly available content. Enterprises looking to leverage RAG for their own use cases must ground the model in their specific data at query time in order to generate answers that are in context and avoid hallucinations. This poses a security challenge: enterprises need AI technology that leverages RAG in a secure manner, generating relevant content while keeping their data secure, private, and compliant.

What does it mean for AI to be “secure”?

Imagine a scenario involving highly confidential data, such as a potential acquisition. One of the deal's evaluators is tasked with performing due diligence on sensitive financial details, employee salaries, and other critical information.

In this context, AI could be extremely beneficial: providing insights, identifying strategies, assessing risks, and summarizing complex documents. But it also means the AI model has access to highly confidential data, including content that the user may not be privy to. This poses a risk of information leakage.

This scenario illustrates three key security and privacy considerations that enterprises need to take into account when leveraging AI.

  1. The right permissions for user access to content: Not everyone in an organization should have access to sensitive content such as employee contracts, the company’s financial projections, etc. Users should only have access to the content they need to do their jobs.
  2. The right user access to AI: Maintain user permissions over the AI itself so data stays in the right hands. For example, you can collaborate with outside agencies while restricting their use of Box AI. Allowing all users to access AI without proper controls could lead to unauthorized data exposure or misuse. For instance, an employee might accidentally reveal sensitive customer information while generating a report. By limiting AI access to specific users and content, organizations can protect their data, maintain compliance with regulations like GDPR, and prevent potential breaches.
  3. AI access to the right content: Guardrails must be built into the AI integration so that the model only references content that the querying user has access to. A minimal sketch of this check follows the list.
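
The sketch below shows one way to enforce such a guardrail: filter retrieval candidates against each document's access control list (ACL) before anything reaches the model. The Document structure and function names are hypothetical stand-ins, not any specific product's implementation.

```python
# A minimal sketch of enforcing permissions before retrieval results ever
# reach the model. The Document/ACL structures here are hypothetical, not
# any specific product's implementation.

from dataclasses import dataclass, field

@dataclass
class Document:
    doc_id: str
    text: str
    allowed_users: set = field(default_factory=set)  # ACL: user IDs with read access

def user_can_read(user_id: str, doc: Document) -> bool:
    """Permission check against the document's ACL."""
    return user_id in doc.allowed_users

def permission_filtered_retrieval(user_id: str, candidates: list) -> list:
    """Keep only documents the querying user may read, so the prompt can
    never be augmented with out-of-permission content."""
    return [d for d in candidates if user_can_read(user_id, d)]

# Example: the salary file is filtered out for a user outside the deal team.
docs = [
    Document("d1", "Public product FAQ", {"alice", "bob"}),
    Document("d2", "Confidential salary data", {"alice"}),
]
assert [d.doc_id for d in permission_filtered_retrieval("bob", docs)] == ["d1"]
```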

By implementing these measures, organizations can protect sensitive information from unauthorized access and prevent AI systems from inadvertently disclosing private data. This safeguards user privacy, maintains trust, and mitigates the potential risks associated with AI misuse.

But enabling this level of granular permission and access control is a hard problem for many enterprises to solve. In many use cases, both the AI and the knowledge base it references must be limited to what the user has access to. Otherwise, there is a risk of revealing information the user is not supposed to see.

In other words, enterprises need a way to trust that AI will respect both content permissions and user permissions even before the actual RAG process kicks off.

Secure, enterprise-grade intelligent content management

For enterprises to leverage RAG, a key challenge they must solve is permissions.

Enterprises need AI technology that leverages RAG in a secure and permission-integrated manner — generating relevant answers while keeping their content secure, private, and compliant.

This is where the secure RAG of Box AI comes into play.

How secure RAG addresses your biggest concerns

Box AI addresses all three of these RAG-related enterprise concerns.

  1. The right user has access to the right content thanks to granular user permission checks, so users can only see and interact with the files and content they’re allowed to access. Box AI always references the permissions set on a file to ensure that the user is allowed to access that particular content.
  2. The right user has access to engage with AI via additional Box AI user configuration checks. Via the Admin Console, admins can even configure which products the user can leverage Box AI on. For example, an organization may not want temporary workers or contractors to ask AI questions, while still allowing them to see the generated content in Box Hubs.
  3. Box AI only references content that the querying user has access to, reducing the risk of data leakage caused by AI. When a user selects text within a Box Note and invokes Box AI, only that highlighted text is included in the prompt. When Box AI is invoked on a document opened in Preview, only the content of that document augments the prompt. And when Box AI is invoked within Hubs, only the curated content within the Hub augments the prompt, producing the most relevant, contextual answer. A sketch of this scoping behavior follows the list.
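
The sketch below illustrates that scoping idea: the content eligible to augment the prompt is determined by the surface where AI was invoked. All names are hypothetical stand-ins for illustration; this is not Box AI's internal implementation.

```python
# Illustrative sketch of scoping the content that may augment the prompt
# to the invocation surface (highlighted text, open document, or Hub).
# All names are hypothetical stand-ins, not Box AI's internal API.

def build_context(surface: str,
                  selection: str | None = None,
                  open_document: str | None = None,
                  hub_documents: list | None = None) -> list:
    """Return only the content in scope for this invocation surface."""
    if surface == "note_selection":
        # Only the text the user highlighted in the note.
        return [selection] if selection else []
    if surface == "preview":
        # Only the document currently opened in Preview.
        return [open_document] if open_document else []
    if surface == "hub":
        # Only the curated content within the Hub.
        return hub_documents or []
    # Default to no content rather than over-sharing.
    return []

# Example: invoking from a Hub only ever exposes the Hub's curated docs.
context = build_context("hub", hub_documents=["q3_plan.txt", "faq.txt"])
```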

By giving the organization the power to scope which content the AI model should reference, Box ensures two critical things:

  1. There is reduced risk of the AI drawing on a document or corpus of documents outside its intended scope and giving irrelevant or noncompliant answers.
  2. The results produced by Box AI are of much higher quality and relevance, while still honoring the user's permissions.

[Image: Box AI model]

Your content partner for secure, enterprise-grade AI

Box AI is designed to meet enterprise-grade standards, ensuring that customers can apply the latest AI capabilities to their content while maintaining protection and security. One of the primary ways we do this is via secure RAG. This not only gives the enterprise full control over AI and content access, it also provides the flexibility to adapt to specific use cases.

Data security and the safe use of AI are critical components of Box’s offerings. To reinforce this commitment, Box has published and adheres to a set of AI principles. These principles guide our development and deployment of AI technologies, ensuring that they align with our values and the stringent security requirements of our customers.

See how Box empowers enterprises with trusted AI

Box does not retain Box AI prompts and resulting outputs without explicit customer consent, nor do we allow our AI model service partners to do so. Once an answer is returned from our AI model provider, that information is deleted from the provider's system. Once a document or application is closed in Box, all question-and-answer information is deleted from Box AI.

With Box AI, enterprises can confidently use advanced AI tools, knowing that their data is handled with the utmost care and integrity. Learn more.

Dive deeper into how Box AI works by taking a look at this white paper.
