ai-rag-template is a template meant to be a base for the implementation of a RAG (retrieval-augmented generation) system.
This repository contains the backend code, which consists of a web server that provides REST APIs to primarily support one type of operation:

  • Chat: Provides a conversation feature, allowing users to ask questions and get responses from the chatbot.

The backend was developed using the LangChain framework, which enables creating sequences of complex interactions using Large Language Models. The web server was implemented using the FastAPI framework.

Main Features

Chat Endpoint (/chat/completions)

The /chat/completions endpoint generates responses to user queries based on provided context and chat history. It leverages information from the configured Vector Store to formulate relevant responses, enhancing the conversational experience.
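
The request shown in this section's curl example can also be sent from Python. The following is a minimal sketch using only the standard library. It assumes the service listens on localhost:3000, as in that example, and the helper names `build_chat_request` and `ask` are hypothetical. The shape of the `chat_history` entries is not shown in this document, so the sketch passes the list through unchanged.

```python
import json
import urllib.request

# Assumption: local service on port 3000, as in the curl example.
CHAT_URL = "http://localhost:3000/chat/completions"


def build_chat_request(chat_query, chat_history=None, url=CHAT_URL):
    """Build a POST request carrying the same JSON body as the curl example."""
    payload = {"chat_query": chat_query, "chat_history": chat_history or []}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )


def ask(chat_query, chat_history=None, timeout=60):
    """Send the question and return the decoded JSON response as a dict."""
    request = build_chat_request(chat_query, chat_history)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `ask("Design a CRUD schema for an online store")` should return a dictionary with the `message` and `references` keys shown in the example response.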


Example request:

curl 'http://localhost:3000/chat/completions' \
  -H 'content-type: application/json' \
  --data-raw '{"chat_query":"Design a CRUD schema for an online store selling merchandise items","chat_history":[]}'

Example response:

{
    "message": "For an online store selling merchandise items, we can design a CRUD schema for a `Product` entity with the following properties:\n\n- `name`: A mandatory string.\n- `description`: An optional string.\n- `price`: A mandatory number.\n\nThe CRUD schema, excluding the default attributes, would look like this:\n\n```json\n[\n  {\n    \"name\": \"name\",\n    \"type\": \"string\",\n    \"required\": true,\n    \"nullable\": false,\n    \"encryptionEnabled\": false,\n    \"encryptionSearchable\": false,\n    \"sensitivityValue\": 0\n  },\n  {\n    \"name\": \"price\",\n    \"type\": \"number\",\n    \"required\": true,\n    \"nullable\": false,\n    \"encryptionEnabled\": false,\n    \"encryptionSearchable\": false,\n    \"sensitivityValue\": 0\n  },\n  {\n    \"name\": \"description\",\n    \"type\": \"string\",\n    \"required\": false,\n    \"nullable\": false,\n    \"encryptionEnabled\": false,\n    \"encryptionSearchable\": false,\n    \"sensitivityValue\": 0\n  }\n]\n```\n\nThis schema defines the structure of the `Product` entity with the necessary properties for managing merchandise items in the online store.",
    "references": [
        {
            "content": "### Create CRUD to Read and Write Table Data  \nTo evaluate the new page, it's essential to create a CRUD microservice and expose the relevant data through an endpoint, facilitating reading and writing operations on our table.  \n:::warning\nIf you're unfamiliar with CRUD microservices, consider consulting the [CRUD Tutorial](/console/tutorials/configure-marketplace-components/rest-api-for-crud-on-data.mdx).\n:::  \nFor our example, let's employ a basic CRUD microservice featuring a `Product` entity endowed with the subsequent properties:\n* `name`: A mandatory string.\n* `description`: An optional string.\n* `price`: A mandatory number.  \nThe data CRUD will be exposed via an endpoint named `products`.  \nBelow is the CRUD schema, excluding the default CRUD attributes (_id, creatorId, createdAt, updaterId, updatedAt, and \\_\\_STATE\\_\\_):  \n```json\n[\n{\n\"name\":\"name\",\n\"type\":\"string\",\n\"required\":true,\n\"nullable\":false,\n\"encryptionEnabled\":false,\n\"encryptionSearchable\":false,\n\"sensitivityValue\":0\n},\n{\n\"name\":\"price\",\n\"type\":\"number\",\n\"required\":true,\n\"nullable\":false,\n\"encryptionEnabled\":false,\n\"encryptionSearchable\":false,\n\"sensitivityValue\":0\n},\n{\n\"name\":\"description\",\n\"type\":\"string\",\n\"required\":false,\n\"nullable\":false,\n\"encryptionEnabled\":false,\n\"encryptionSearchable\":false,\n\"sensitivityValue\":0\n}\n]\n```\nNow, the CRUD data can be exposed using an endpoint named `products`.",
            "url": ""
        },
        {
            "content": "### Create a CRUD for persistency  \nTo create a CRUD service you can follow [this](/console/tutorials/configure-marketplace-components/rest-api-for-crud-on-data.mdx) tutorial.\nAs data schema please import this <a download target=\"_blank\" href=\"/docs_files_to_download/flow-manager-service/saga-collection.json\">schema</a>.  \nRemember to create a **unique index** for the collection on the `sagaId` field and to set the **default state** for new documents to `PUBLIC`.  \nTo do this follow these steps:\n1. Open the _Design_ section of the Console.\n1. On the left panel, in the _Data Models_ group, click on _MongoDB CRUD_ section.\n1. Click on the CRUD you created.\n1. In the _Indexes_ section click _Add index_.\n1. Enter these values:\n- **Name**: `sagaIdIndex`\n- **Type**: `Normal`\n- **Field**: `sagaId`  \n<div style={{display: 'flex', justifyContent: 'center'}}>\n<div style={{display: 'flex', width: '600px'}}>  \n![Create CRUD index](img/create-crud-1.png)  \n</div>\n</div>  \n1. Click _Create_. The new index will be shown.\n1. Set the `unique` checkbox for the `sagaIdIndex` index.\n1. In the _Internal Endpoints_ section make sure that `Default state` is set to `PUBLIC`.  \n<div style={{display: 'flex', justifyContent: 'center'}}>\n<div style={{display: 'flex', width: '600px'}}>  \n![Create CRUD index](img/create-crud-2.png)  \n</div>\n</div>  \nYou can find more information on CRUD Persistency Manager in the [dedicated](/runtime_suite/flow-manager-service/ page.",
            "url": ""
        }
    ]
}

Metrics Endpoint (/-/metrics)

The /-/metrics endpoint exposes the metrics collected by Prometheus.

High Level Architecture

The following is the high-level architecture of ai-rag-template.

flowchart LR
  vs[(Vector Store)]
  llm[LLM API]
  eg[Embeddings Generator API]

  fe --1. user question +\nchat history--> be
  be --2. user question--> eg
  eg --3. embedding-->be
  be --4. similarity search-->vs
  vs --5. similar docs-->be
  be --6. user question +\nchat history +\nsimilar docs-->llm
  llm --7. bot answer--> be
  be --8. bot answer--> fe
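
The eight steps in the diagram can be read as a small pipeline inside the backend. The sketch below is illustrative only: `RagPipeline` and its `embed`, `search`, and `complete` callables are hypothetical names standing in for the Embeddings Generator API, the Vector Store, and the LLM API. It is not the template's actual LangChain chain.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class RagPipeline:
    """Hypothetical wiring of the three external services in the diagram."""

    embed: Callable[[str], List[float]]  # Embeddings Generator API (steps 2-3)
    search: Callable[[Sequence[float]], List[str]]  # Vector Store (steps 4-5)
    complete: Callable[[str, List[str], List[str]], str]  # LLM API (steps 6-7)

    def answer(self, question: str, chat_history: List[str]) -> str:
        # Steps 2-3: turn the user question into an embedding.
        embedding = self.embed(question)
        # Steps 4-5: fetch documents similar to the question.
        similar_docs = self.search(embedding)
        # Steps 6-7: ask the LLM with question, history, and retrieved docs.
        return self.complete(question, chat_history, similar_docs)


# Stand-in implementations so the flow can be exercised without any service.
pipeline = RagPipeline(
    embed=lambda text: [float(len(text))],
    search=lambda embedding: ["doc about CRUD"],
    complete=lambda q, history, docs: f"answer to {q!r} using {len(docs)} doc(s)",
)
print(pipeline.answer("What is a CRUD?", []))
```

Keeping the three services behind plain callables makes it easy to swap in fakes for tests, which is the only design point this sketch is meant to show.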


Configuration

The service requires several configuration parameters for execution. Below is an example configuration:

{
  "llm": {
    "name": "gpt-3.5-turbo"
  },
  "embeddings": {
    "name": "text-embedding-3-small"
  },
  "vectorStore": {
    "dbName": "database-test",
    "collectionName": "assistant-documents",
    "indexName": "vector_index",
    "relevanceScoreFn": "euclidean",
    "embeddingKey": "embedding",
    "textKey": "text",
    "maxDocumentsToRetrieve": 4,
    "minScoreDistance": 0.5
  },
  "documentation": {
    "repository": {
      "baseUrl": "",
      "owner": "/mia-platform",
      "name": "/documentation",
      "baseDir": "docs",
      "supportedExtensions": [
      ],
      "requestTimeoutInSeconds": 30
    },
    "website": {
      "baseUrl": ""
    }
  },
  "chain": {
    "aggregateMaxTokenNumber": 2000,
    "rag": {
      "promptsFilePath": {
        "system": "/path/to/system-prompt.txt",
        "user": "/path/to/user-prompt.txt"
      }
    }
  }
}

Description of configuration parameters:

| Param Name | Description |
|---|---|
| LLM Name | Name of the chat model to use. Must be supported by LangChain. |
| Embeddings Name | Name of the encoder to use. Must be supported by LangChain. |
| Vector Store DB Name | Name of the MongoDB database to use as a knowledge base. |
| Vector Store Collection Name | Name of the MongoDB collection to use for storing documents and document embeddings. |
| Vector Store Index Name | Name of the vector index to use for retrieving documents related to the user's query. Note: Currently, it's necessary to manually create this index on MongoDB Atlas. |
| Vector Store Relevance Score Function | Name of the similarity function used for extracting similar documents using the created vector index. Note: Must be the same one used to create the vector index. |
| Vector Store Embeddings Key | Name of the field used to save the semantic encoding of documents. |
| Vector Store Text Key | Name of the field used to save the raw document (or chunk of document). |
| Vector Store Max. Documents To Retrieve | Maximum number of documents to retrieve from the Vector Store. |
| Vector Store Min. Score Distance | Minimum distance beyond which retrieved documents from the Vector Store are discarded. |
| Documentation Repository Base Url | Base path of the GitHub repository to download documentation from. |
| Documentation Repository Owner | Owner name of the documentation repository. |
| Documentation Repository Name | Name of the documentation repository. |
| Documentation Repository Base Dir. | Name of the folder containing the documentation source. |
| Documentation Repository Request Timeout In Seconds | Time limit to download a single documentation file. |
| Documentation Repository Supported Extensions | Names of the supported file extensions (currently only Markdown files). |
| Chain RAG System Prompts File Path | Path to the file containing system prompts for the RAG model. |
| Chain RAG User Prompts File Path | Path to the file containing user prompts for the RAG model. |
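
Because the vector index has to be created manually on MongoDB Atlas, it can help to see what such an index definition might look like. The sketch below builds one in the Atlas Vector Search format (`fields` entries of type `vector`). It assumes the values from the example configuration (`embeddingKey` set to `embedding`, `relevanceScoreFn` set to `euclidean`) and the default 1536-dimensional output of `text-embedding-3-small`. Whether the template expects this index format, and how it maps `relevanceScoreFn` to the index similarity, are assumptions. The helper name `vector_index_definition` is hypothetical.

```python
import json


def vector_index_definition(embedding_key="embedding", similarity="euclidean", dimensions=1536):
    """Return an Atlas Vector Search index definition as a plain dict.

    The defaults mirror the example configuration; adjust `dimensions`
    if a different embeddings model is configured.
    """
    return {
        "fields": [
            {
                "type": "vector",
                "path": embedding_key,  # must match the configured embeddingKey
                "numDimensions": dimensions,  # must match the embeddings model output size
                "similarity": similarity,  # should match the configured relevanceScoreFn
            }
        ]
    }


# Print the JSON so it can be pasted into the Atlas index editor.
print(json.dumps(vector_index_definition(), indent=2))
```

The index itself would be created under the name given in `indexName` (`vector_index` in the example configuration).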

Local Development

  • Before getting started, make sure you have the following information:

    • A valid connection string to connect to MongoDB Atlas
    • An OpenAI API Key to generate embeddings and contact the chat model (it's better to use two different keys)
  • Copy the sample environment variables into a file used for development and replace the placeholders with your own values. As an example, you can create a file called local.env from default.env with the following command:

cp default.env local.env
  • Modify the values of the environment variables in the newly created file
  • Create a configuration file at the path defined by the CONFIGURATION_PATH value in the environment variables file. As an example, you can copy the default.configuration.json file into a new file called local.configuration.json with the following command:
cp default.configuration.json local.configuration.json
  • Modify the values of the configuration in the newly created file, according to the definitions included in the Configuration paragraph
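
Once the configuration file is in place, a quick sanity check can catch a missing section before the server starts. This is a minimal sketch, not the template's own loader: the function name `load_configuration` is hypothetical, and treating all five top-level sections from the example configuration as required is an assumption.

```python
import json
from pathlib import Path

# Top-level sections taken from the example configuration.
# Assumption: all of them are required by the service.
REQUIRED_SECTIONS = ("llm", "embeddings", "vectorStore", "documentation", "chain")


def load_configuration(path):
    """Read a JSON configuration file and check that every top-level section exists."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [section for section in REQUIRED_SECTIONS if section not in config]
    if missing:
        raise ValueError(f"missing configuration sections: {', '.join(missing)}")
    return config
```

A typical call would pass the same path stored in the environment file, for example `load_configuration(os.environ["CONFIGURATION_PATH"])` after importing `os`.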


  • Create a virtual environment to install project dependencies
python3 -m venv .venv
  • Activate the new virtual environment
source .venv/bin/activate
  • Install project dependencies
make install

You can run the web server with this command:

# This uses the environment variables defined in `local.env`
make start
# Or you can run:
dotenv -f <<YOUR_ENV_FILE>> run -- python -m

You can review the API using the Swagger UI exposed at http://localhost:3000/docs

Contributing


To contribute to the project, please always create a branch for your updates and submit a Merge Request, requesting approval from one of the maintainers of the repository.

When you push your commits, pre-commit operations run automatically to execute the unit tests and lint your code.

Unit tests

Make sure the unit tests pass at all times. You can verify this with:

make test

Some of our tests include snapshots, which can be updated with:

make snapshot

NOTE: you might need to run make test again after updating the snapshots

Please make sure you include new tests or update the existing ones, according to the feature you are working on.

Lint


We use pylint as a linter. Please try to follow the lint rules. You can run:

make lint

to make sure that code and tests follow our lint guidelines.

To fix any issue you can run

make lint-fix

or manually fix your code according to the errors and warnings received.

Add new dependencies

You can add new dependencies, according to your needs, with the following command:

python -m pip install <<module_name>>

However, the pip package manager does not automatically update the list of dependencies in the requirements.txt file. You have to do it yourself with:

make freeze
# Or:
python -m pip freeze > requirements.txt

Startup with Docker

If you prefer Docker...

  • Build your image
docker build . -t ai-rag-template
  • Run the web server
docker run --env-file ./local.env -p 3000:3000 -d ai-rag-template

Try the ai-rag-template

You can also use the ai-rag-template with a CLI. Please follow the instructions in the related README file.