Retrieval-Augmented Generation (RAG) is an innovative method that boosts the capabilities of Large Language Models (LLMs) by integrating them with external data sources. This technique involves retrieving relevant data or documents related to a specific query or task, providing the LLM with additional context. As a result, organizations can draw on their internal documents to significantly improve the relevance of the LLM's output and reduce one of the biggest challenges with LLMs today: hallucinations.
RAG allows LLMs, which are trained on vast datasets and have billions of parameters, to access up-to-date or specialized knowledge without the need to retrain or fine-tune the model. It is particularly valuable for applications such as support chatbots and Q&A systems that require timely and accurate information. By merging the strengths of LLMs with the reliability of external knowledge bases, RAG ensures that generated responses are not only fast and coherent but also grounded in authoritative and current data.
There are three common steps to create a RAG-based architecture:
RAG is a fantastic solution for building LLM-based applications using our own data without the need for training or fine-tuning. However, there's a significant drawback that could hinder its use in organizations or applications — RAG does not natively support access control, and implementing it on most vector databases is not straightforward.
When indexing documents in a vector database without storing additional metadata, any user query will be compared against all vectors in the database. The most relevant documents will then be retrieved and used to generate an answer.
This means that if a user asks a question about a topic they shouldn't have access to, they could still receive an answer if the relevant data exists in the database, even without any injection or bypass techniques.
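To make the leak concrete, here is a minimal sketch of retrieval over an index that stores no access metadata. The documents, the hand-written three-dimensional "embeddings", and the `retrieve` helper are all hypothetical stand-ins for a real embedding model and vector database.

```python
import math

# Hypothetical toy index: (document text, embedding) pairs with no access metadata.
INDEX = [
    ("Q3 salary bands for engineering", [0.9, 0.1, 0.0]),
    ("Office Wi-Fi setup guide", [0.0, 0.2, 0.9]),
    ("Holiday calendar", [0.1, 0.9, 0.1]),
]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, k=1):
    """Return the k most similar documents. Nothing here knows who is asking."""
    ranked = sorted(INDEX, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Assumed embedding of a question about salaries, asked by any user at all.
salary_question = [1.0, 0.0, 0.0]
print(retrieve(salary_question))
```

Because `retrieve` takes only the query vector, the salary document comes back for whoever asks a close enough question.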
Today, there are two main strategies to secure access and permissions while building RAG:
One way to ensure secure access is by creating separate instances for different data types or user roles. For example:
This approach allows the application to direct queries to the appropriate instance based on the user's role, ensuring that sensitive data remains protected within its designated instance.
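The routing described above might look something like the sketch below. The role names, the `INSTANCES` mapping, and the keyword match standing in for vector search are illustrative assumptions; in practice each entry would point to its own vector database instance.

```python
# Hypothetical per-role indexes; each list stands in for a separate vector database instance.
INSTANCES = {
    "hr": ["Salary bands", "Performance review template"],
    "engineering": ["Deployment runbook", "On-call rota"],
}

def instance_for(role):
    """Pick the instance a role may query; unknown roles get no instance at all."""
    if role not in INSTANCES:
        raise PermissionError(f"no RAG instance configured for role {role!r}")
    return INSTANCES[role]

def search(role, keyword):
    """Naive keyword match standing in for vector search inside one instance."""
    docs = instance_for(role)
    return [d for d in docs if keyword.lower() in d.lower()]

print(search("hr", "salary"))           # searched inside the HR instance
print(search("engineering", "salary"))  # the HR documents are simply not there
```

Isolation here comes from data placement rather than filtering: an engineering query never touches the HR documents, which also means any document needed by both roles has to be indexed twice.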
However, this approach introduces new problems:
Another method involves adding metadata attributes to each document during indexing, specifying the roles or users authorized to access that document. When a user queries the RAG system, the search is limited to documents they have permission to access, based on their role or user ID.
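As a rough illustration of metadata-based filtering, the sketch below attaches an `allowed_roles` set to each document and filters on it before ranking. The field name, the toy two-dimensional vectors, and the `retrieve` helper are assumptions made for the example; real vector databases expose their own filter syntax.

```python
import math

# Hypothetical documents indexed with an access-control metadata attribute.
DOCS = [
    {"text": "Salary bands", "vec": [0.9, 0.1], "allowed_roles": {"hr"}},
    {"text": "Expense policy", "vec": [0.7, 0.3], "allowed_roles": {"hr", "engineering"}},
    {"text": "Deployment runbook", "vec": [0.1, 0.9], "allowed_roles": {"engineering"}},
]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, user_roles, k=2):
    """Rank only the documents that share at least one role with the user."""
    roles = set(user_roles)
    # Filter before ranking, so unauthorized documents never become candidates.
    allowed = [d for d in DOCS if d["allowed_roles"] & roles]
    ranked = sorted(allowed, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["text"] for d in ranked[:k]]

# Assumed embedding of a compensation-related question.
question = [1.0, 0.0]
print(retrieve(question, {"engineering"}, k=1))
print(retrieve(question, {"hr"}, k=1))
```

The same question returns different documents depending on the caller's roles, but only as long as the `allowed_roles` metadata is correct and kept in sync with the source permissions.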
This method also has its own drawbacks:
👉 While both approaches aim to enhance data security, each comes with its own set of challenges that need to be carefully managed.
Not really.
In many organizations, it's common to have files on shared drives or storage systems with "Everyone" permissions or other broad access roles that aren't ideal. This isn't a new problem introduced by LLMs and RAGs, but these systems can amplify the issue, making it even more challenging to manage.
If a file with "Everyone" permission is buried deep within a shared drive, and a user isn't aware of its existence, the risk of them accessing it is relatively low. However, with RAG systems, a user could unintentionally reach content they were never meant to see simply by asking a question. If the answer lies in one of these broadly accessible files, the system could retrieve and present this information without the user ever needing to know where the file is stored.
To address this issue, we need to implement an additional layer of security:
While traditional access control mechanisms help manage document and data access based on permissions, in the unpredictable world of LLMs and the evolving architecture of RAG, a new solution is required to oversee the actual data being requested and received.
Enter **Context-Based Access Control (CBAC)**, the latest innovative feature from Lasso Security.
CBAC introduces a new perspective to the world of LLMs and RAG by focusing on the context of both the request and the response and comparing it to a few parameters relevant to the user's expected behavior. This approach goes beyond structured data to understand the nuances of context and ensure secure data handling in many challenging use cases of the GenAI world. This new feature provides great granularity for admins and helps them ensure safe usage of RAG without the overhead of building, maintaining, and updating multiple systems indefinitely.
CBAC addresses these questions by providing a context-based approach to data access. It enables organizations to:
By implementing CBAC on top of the previous alternatives (separate instances and document-level access control), organizations can elevate their security and control, and start using RAG as part of their enterprise stack without the overhead and drawbacks of each method.
This groundbreaking approach redefines data protection, guaranteeing that only the right information reaches the right users at exactly the right moment, setting a new standard for access management and permissions.
For more information on how Lasso Security can help you fortify your RAG and LLM, schedule a call with our team.