Get answers from your documents
AI21’s RAG Engine offers enterprises an all-in-one solution for implementing Retrieval-Augmented Generation. You can upload your documents (PDF, DOCX, HTML, or TXT) to your document library, then use the RAG Engine to query those documents in natural language. The answer is generated solely from the contents of those documents; if the information is not included in them, the RAG Engine will say so. The base model is used solely for generating natural language text.
To see details such as supported file formats and max file sizes, see the file upload reference.
The RAG Engine comprises the following parts, each of which can be called individually:
The RAG Engine works in three basic stages:
Upload your files (PDF, DOCX, HTML, and TXT) to the RAG Engine, where each account gets free storage up to 1 GB. (Want more? Contact us: sales@ai21.com)
You can also integrate your organization’s data sources, such as Google Drive, Amazon S3, and others, to automatically sync documents with the RAG Engine. To enable data source integration, contact us.
To upload, list, delete, and update files in your library, you can use either these API endpoints or the AI21 playground web app.
In this example, we will upload three documents to the library. We show examples for both the Python SDK and direct HTTP REST requests:
File labels
You can apply zero or more arbitrary text labels to each file, and later limit your list, get, search, and query requests to files matching specific labels.
File paths
Similarly, you can assign an optional, arbitrary path-like label to each file. This path-style label enables a hierarchical labeling system. For example, you might assign the path financial/USA to some files and financial/UK to other files, and then limit your query to financial/USA to get US financial info, financial/UK to get UK financial info, or financial/ to get all financial information. Path matching is simple prefix matching, and doesn’t enforce or verify the path syntax.
This can help you organize your filing system, while focusing your questions on a subset of documents.
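The prefix matching described above can be illustrated locally. This is a minimal sketch of the matching rule only, not the RAG Engine’s implementation; the function name match_path_prefix and the sample paths are hypothetical.

```python
def match_path_prefix(paths, prefix):
    """Return the paths that start with prefix (plain string prefix match)."""
    return [p for p in paths if p.startswith(prefix)]


# Hypothetical file paths assigned at upload time.
paths = ["financial/USA", "financial/UK", "hr/policies"]

usa_only = match_path_prefix(paths, "financial/USA")
all_financial = match_path_prefix(paths, "financial/")
# No path syntax is enforced, so a partial segment is also a valid prefix.
partial = match_path_prefix(paths, "financial/U")
```

Because the match is a plain prefix with no syntax checks, a trailing slash matters: financial/ selects only entries under that prefix, while a shorter string such as fin would also select any unrelated path that happens to start with the same characters.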
Once you have documents in your library, you can ask questions based on document content and get an answer in natural language.
The RAG Engine searches through the document library, filtering documents by any filtering parameters that you provide, looking for relevant content. When it finds relevant content, it ingests it and then generates an answer, along with a list of sources from the library used to generate the answer.
To query files in code, use this endpoint. To query your library in your browser, use the RAG Engine playground and select Ask your documents.
Let’s ask a question about working from home:
The response is:
Note that the full response returned from the model contains the sources used to generate the answer (see the sources field).
If there is no answer
If the answer to the question is not in any of the documents, the response will have answer_in_context set to False and an assistant message saying “Answer not in document.”
You can do a quick test to see whether your answer was in the document by checking answer_in_context:
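As a rough sketch of such a check, the helper below treats the response as a plain dictionary. The answer_in_context and sources fields come from the text above; the answer key, the helper name extract_answer, and the sample values are assumptions made for illustration, so adapt them to the response object you actually receive.

```python
def extract_answer(response):
    """Return the answer text, or None when no answer was found in the library."""
    # answer_in_context is the flag described above; a missing flag is treated as "no answer".
    if not response.get("answer_in_context", False):
        return None
    # "answer" is an assumed key name for the generated text.
    return response.get("answer")


# Made-up sample responses, shaped after the fields named in this guide.
found = {"answer_in_context": True, "answer": "Sample answer text.", "sources": []}
missing = {"answer_in_context": False, "answer": "Answer not in document.", "sources": []}

found_text = extract_answer(found)
missing_text = extract_answer(missing)
```

Checking the flag rather than comparing the message text keeps the check robust if the wording of the “Answer not in document.” message ever changes.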
When analyzing PDFs, the recommended approach is to use AI21’s native PDF support. If you are using a custom parser, ensure that it parses tables and other information accurately, as Contextual Answers can be sensitive to incorrectly parsed input data. Note that Contextual Answers also supports .docx, .html, and standard .txt files.
When analyzing tables, we recommend passing the table contents as JSONL, where each row has the key (i.e. column name) and the value (i.e. the corresponding row entry). Note that for smaller tables, or for tables embedded within a larger text, this step frequently can be skipped, as Contextual Answers will generally be able to surface answers from the raw table.
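One way to produce that JSONL form is sketched below using only the Python standard library. It assumes the table is available as CSV text with a header row; the function name table_to_jsonl and the sample table are hypothetical.

```python
import csv
import io
import json


def table_to_jsonl(csv_text):
    """Convert CSV text with a header row into JSONL, one JSON object per table row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Each row becomes {column name: cell value}; csv yields every value as a string.
    return "\n".join(json.dumps(row) for row in reader)


# Made-up sample table for illustration.
sample_csv = "region,revenue\nUSA,120\nUK,80\n"
jsonl = table_to_jsonl(sample_csv)
```

Each output line repeats the column names as keys, so every row stays self-describing even if the text is later split into chunks.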
When evaluating Contextual Answers, as with any model, a sound evaluation process is crucial for assessing performance. Follow these general steps to refine your evaluation methods:
Create a Test Set: Begin with 10 or more questions, each with its own context and “gold answer.” This helps establish a baseline for measuring improvements. We recommend including a diverse set of question types.
Ensure Accuracy of your Test Set: Verify that the gold answers are correct and are actually contained within the given context. You can use answers sourced from other large language models, but you must still confirm that each one is accurate and is contained within the context.
Comprehensive Evaluation: Evaluate not only the True Positive instances (correct answers) but also consider True Negative instances (correctly identified as “Answer not in doc”). This ensures a balanced evaluation of the model.
Evaluate Correctness of the Response: Evaluation can be done either manually or automatically (for example, by using an LLM). However, if an LLM is used, take care to avoid biases in evaluation, since LLMs generally prefer responses from an LLM of a similar type. We recommend using human evaluation either for the entire process or at least to verify the LLM’s classification of the correctness of the Contextual Answers response.
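The steps above can be tallied with a small script. This is a minimal sketch under stated assumptions: the EvalCase structure, the summarize function, and the category names are hypothetical, and the judged_correct verdict is expected to come from a human reviewer (or a human-checked LLM judgment), not from this code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvalCase:
    question: str
    gold_answer: Optional[str]    # None means the answer is not in the context
    model_answer: Optional[str]   # None means the model reported no answer
    judged_correct: bool = False  # reviewer verdict, used only when both answers exist


def summarize(cases):
    """Count outcomes, including true negatives, across a test set."""
    counts = {"true_positive": 0, "true_negative": 0,
              "false_positive": 0, "false_negative": 0, "wrong_answer": 0}
    for case in cases:
        if case.gold_answer is None:
            # The context has no answer: the model should have declined.
            key = "true_negative" if case.model_answer is None else "false_positive"
        elif case.model_answer is None:
            # The context has an answer, but the model declined.
            key = "false_negative"
        else:
            key = "true_positive" if case.judged_correct else "wrong_answer"
        counts[key] += 1
    return counts


# Made-up cases for illustration.
cases = [
    EvalCase("q1", "gold 1", "model 1", judged_correct=True),
    EvalCase("q2", None, None),
    EvalCase("q3", None, "model 3"),
    EvalCase("q4", "gold 4", None),
]
summary = summarize(cases)
```

Keeping the negative cases in separate buckets makes it easy to see whether the model declines when it should, which a single accuracy figure would hide.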