benbenz/django-llm-portal

features

  • same features as ryan-blunden/django-chatgpt-clone
  • safer handling of client conversations (not using window.localStorage)
  • conversations stored in the database, per user
  • added R.A.G. mode using LlamaIndex
  • document preview + highlighting
  • "dynamic applications" (DynApps): upload a set of documents
  • connection to the MS Graph API for live SharePoint ingestion
  • sequential ingestion (repetitive), with recovery
  • parametrization of RAG through settings: chat behavior, query type (summarize, etc.)
  • handling of OpenAI, Azure OpenAI and local LLMs
  • handling of multiple indexing configurations (loading params, chunking params, index type, embeddings, language, etc.)
  • handling of different vector stores: simple (default), Qdrant, pgvector, Chroma, Elasticsearch
  • handling of multiple tenants + applications from the same code base
  • possibility of mutualizing the embeddings when creating indexes across multiple stores
  • search feature (powered by Elasticsearch)
  • ASGI + asyncio implementation (as well as WSGI + sync)
  • full offline mode with local LLM/embeddings and local static files
  • TODO: different levels of security and sharding per tenant

install

docker-compose

Make sure:

  1. The HOST environment variable is set in the .env file.
  2. The HOST value also matches the ./.certs/HOST/ path, which must exist for Nginx to work.
  3. The logs, data and storage directories exist.
  4. apps.json and llms.json are present in the chat fixtures, users.json in the users fixtures, and the .env file is ready (see the setup sketch below).
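
A minimal prerequisite sketch (hypothetical values; replace llm.example.com with your actual HOST, and use real certificates instead of the localhost script for real deployments):

# example only: adapt the host name and paths to your deployment
echo 'HOST=llm.example.com' >> .env
mkdir -p ./.certs/llm.example.com logs data storage
cp src/chat/fixtures/apps_default.json src/chat/fixtures/apps.json
cp src/chat/fixtures/llms_default.json src/chat/fixtures/llms.json
cp src/users/fixtures/users_default.json src/users/fixtures/users.json
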
# install docker
https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository

# generate localhost certificates if you want to test locally
./make_certs_localhost.sh

# run
docker-compose up -d
# or (using the plugin)
docker compose up -d

# build and run
docker-compose up -d --build

# if issues: force rebuild with debug output
docker-compose build --progress=plain --no-cache

# zoom in (open a shell in the app container)
docker-compose run --rm -it django_app bash

locally

applications / env

nix-shell # if not using nix-shell, you will need the openssl library, libjpeg, zlib and docker (if using qdrant)
python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements/local.txt 
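
If you are not using nix-shell, the system libraries mentioned above can be installed on Debian/Ubuntu with something like this (the usual dev package names; adjust for your distribution):

sudo apt-get install -y libssl-dev libjpeg-dev zlib1g-dev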

Init the *.json fixture files and the portal image

cp src/chat/fixtures/apps_default.json src/chat/fixtures/apps.json 
cp src/chat/fixtures/llms_default.json src/chat/fixtures/llms.json 
cp src/chat/fixtures/tenants_default.json src/chat/fixtures/tenants.json 
cp src/users/fixtures/users_default.json src/users/fixtures/users.json 
curl http://...... --output src/llm_portal/static/images/portal_image.svg

# edit the files to your liking ...

init data

python manage.py migrate
python manage.py inittenants
python manage.py initusers
python manage.py initapps
python manage.py initllms

set config

cp sample.env .env

... edit the .env config ... (API keys, system prompts, options, etc.)

Index the data

Edit the ./idx files to your liking:

cp idx/rag.idx.sample idx/rag.idx
cp idx/ragdyn.idx.sample idx/ragdyn.idx
# use the no cache config to index ...
DJANGO_SETTINGS_MODULE=llm_portal.settings.production_no_cache python manage.py index YOUR_APP

Build the Tailwind files / deploy etc.

./reset_deploy.sh

run server + indexing + cleaning process for dynapps & files

DynApps indexing process

python manage.py index_dynapps --use_loop

DynApps and Files cleaning process

python manage.py clean_all --use_loop
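
For local development you can run all three processes side by side, e.g. as below (a rough sketch; in production you would typically run each one under a process supervisor or in its own container):

# background the two loops, then start the dev server
python manage.py index_dynapps --use_loop &
python manage.py clean_all --use_loop &
python manage.py runserver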

Sync

python manage.py runserver

gunicorn -w 8 src.llm_portal.wsgi

Test sync concurrency with: gunicorn -w 1 src.llm_portal.wsgi (with a single sync worker, requests should actually NOT be served concurrently)

Important note: the streaming response is currently blocking; if you only have one worker registered with the server, it will block the other requests until the streaming response is finished!
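
One possible mitigation (an assumption, not something documented by the project): run gunicorn with threaded workers so a blocking stream only ties up one thread instead of a whole worker:

gunicorn -w 4 --threads 4 --worker-class gthread src.llm_portal.wsgi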

Async

python -m uvicorn --loop asyncio src.llm_portal.asgi:application

Test async concurrency with: python -m uvicorn --workers 1 --loop asyncio src.llm_portal.asgi:application

The async URLs are /async/chat and /async/files

Important Note: if you want to run the portal with the async methods, you should use uvicorn instead of daphne. The streaming is choppy with daphne but not with uvicorn...
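
For a production-style async run with several workers (mirroring the load-testing command further below):

python -m uvicorn --workers 4 --loop asyncio --host 0.0.0.0 --port 8000 src.llm_portal.asgi:application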

vector stores

For more information about the vector stores, please see here
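
To try one of the non-default stores locally, Qdrant can be started with Docker (generic Qdrant usage, not project-specific configuration; the connection details still have to be provided through the portal settings/.env):

# local Qdrant instance, HTTP API on port 6333, data persisted under ./data/qdrant
docker run -d -p 6333:6333 -v $(pwd)/data/qdrant:/qdrant/storage qdrant/qdrant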

Using Nginx

When using Nginx, the following configuration is recommended:

server {
    listen 443 ssl;

    server_name YOUR_DOMAIN;
    index index.php index.html index.htm;

    ssl_certificate      PATH_TO_FULLCHAIN_PEM;
    ssl_certificate_key  PATH_TO_PRIVKEY_PEM;

    gzip on;
    gzip_disable "msie6";

    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_buffers 16 8k;
    gzip_http_version 1.1;
    gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript application/pdf;

    location / {
        proxy_pass YOUR_UVICORN_HOST_PORT;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
        proxy_set_header Host $http_host;
        proxy_buffering off;
        proxy_redirect off;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }

    error_page 404 /404.html;
    error_page 500 502 503 504 /50x.html;
    
    location = /50x.html {
        root /usr/share/nginx/html;
    }
}
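
After editing the configuration, validate it and reload Nginx:

sudo nginx -t
sudo systemctl reload nginx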

Testing

  1. Make sure production docker containers are running

  2. Start PGBouncer (if using llm_portal.settings.loadtesting)

cd conf/pgbouncer
pgbouncer pgbouncer.simple.ini
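
For reference, a minimal pgbouncer configuration of this shape might look like the following (a hypothetical sketch, not the actual file shipped with the repo; the listen port and database aliases follow the .env example below, while the backend dbnames are placeholders):

[databases]
db_bouncer_pgsql    = host=127.0.0.1 port=5432 dbname=YOUR_DJANGO_DB
db_bouncer_pgvector = host=127.0.0.1 port=5432 dbname=YOUR_PGVECTOR_DB

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = plain
auth_file = userlist.txt
max_client_conn = 10000
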
  3. Init users and apps, using users_default.conf/apps_default.conf instead of users.conf/apps.conf:
DJANGO_SETTINGS_MODULE=llm_portal.settings.loadtesting python manage.py initusers --use_default
# for v2+
DJANGO_SETTINGS_MODULE=llm_portal.settings.loadtesting python manage.py initapps --use_default
# And init the LLMs now (for v2)
DJANGO_SETTINGS_MODULE=llm_portal.settings.loadtesting python manage.py initllms
  4. Index the 'testing' application
DJANGO_SETTINGS_MODULE=llm_portal.settings.loadtesting python manage.py index testing
# or, older way (v1)
DJANGO_SETTINGS_MODULE=llm_portal.settings.loadtesting APP_NAME=testing python manage.py

Note: you need to have some data in the data/testing directory ...

  5. Run the server

For v1: make sure the .env settings connect to pgbouncer and not to the databases directly. You can use the same users/passwords in pgbouncer's userlist.txt, which avoids having to rewrite the user/password in the .env file every time you want to switch back to direct connections.

...
PGVECTOR_HOST="127.0.0.1"
PGVECTOR_DB="db_bouncer_pgvector" # HERE
PGVECTOR_PORT="6432" # HERE
PGVECTOR_USER="pgvectoruser"
PGVECTOR_PASSWORD="pgvectorpassword"
PGVECTOR_TEXT_SEARCH_CONFIG="english"
...
DB_ENGINE="django.db.backends.postgresql"
DB_HOST="localhost"
DB_NAME="db_bouncer_pgsql" # HERE
DB_PORT="6432" # HERE
DB_USER="pgsql"
DB_PASSWORD="pgsqlpassword"
...
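
The pgbouncer userlist.txt mentioned above holds one "user" "password" pair per line; with the credentials from this example it would contain (plain-text auth assumed):

"pgsql" "pgsqlpassword"
"pgvectoruser" "pgvectorpassword"
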
# with uvicorn (async)
DJANGO_SETTINGS_MODULE=llm_portal.settings.loadtesting APP_NAME=testing python -m  uvicorn --workers 4 --port 8000 --host 0.0.0.0 --loop asyncio src.llm_portal.asgi:application

# with gunicorn (sync)
DJANGO_SETTINGS_MODULE=llm_portal.settings.loadtesting APP_NAME=testing gunicorn -w 4 -b :8000 --timeout 320 src.llm_portal.wsgi
  6. Functional testing
DJANGO_SETTINGS_MODULE=llm_portal.settings.testing python manage.py test
  7. Load testing with Locust
locust -f test/load/locustfile.py

Note 1: Make sure that the variables LOCUST_*** are correct in the .env file.

Note 2: If the client can't login and you see this error in the Locust debug: "gaierror(8, 'nodename nor servname provided, or not known')", you may need to add your domain to the /etc/hosts file. For example, for http://testing.localhost:8000/, add 127.0.0.1 testing.localhost

Note 3: Do NOT add a trailing slash at the end of the server URL
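
For unattended runs, Locust can also be started headless with standard Locust options (example values only):

# 50 simulated users, spawned 5 per second, for 5 minutes
locust -f test/load/locustfile.py --headless -u 50 -r 5 --run-time 5m --host http://testing.localhost:8000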

Notes (from older project ... to be updated)

  1. PostgreSQL:
  • Ubuntu: /etc/postgresql/15/main/postgresql.conf

  • set max db connections in PostgreSQL to 100

  • set work_mem to 96M maybe?

  • Don't forget to make sure these are set:
    ALTER ROLE DB_USER SET client_encoding TO 'utf8';
    ALTER ROLE DB_USER SET default_transaction_isolation TO 'read committed';
    ALTER ROLE DB_USER SET timezone TO 'UTC';

    For PostgreSQL 15: ALTER DATABASE DB_NAME OWNER TO DB_USER;

  • Tuner for PostgreSQL: https://pgtune.leopard.in.ua/#/

  2. pgbouncer (new terminal window):
  • set max_client_conn in pgbouncer to 10000
  • set ulimit to 10212 (as suggested by pgbouncer at startup): ulimit -n 10212
  • pgbouncer pgbouncer.simple.ini -q
  3. Memcached (2 GB memory allocated + 5000 concurrent connections) (new terminal window):
  • ulimit -n 10212
  • memcached -m 2048 -c 5000 -vvv start, or
  • memcached -m 2048 -c 5000 -s ../3rdparty/memcached/memcached.sock -vvv -a 0770 start, and change settings.py so that the location is the ./3rdparty/memcached.sock file
  4. Django (new terminal window):
  • python manage.py makemigrations
  • python manage.py migrate (issue with PostgreSQL 15: https://stackoverflow.com/questions/74110708/postgres-15-permission-denied-for-schema-public (SCHEMA PUBLIC etc.))
  • python manage.py initdata_massive
  • 'CONN_MAX_AGE': 0 in the DB settings
  • ulimit -n 10212 (before starting the server too) (for files)
  • launchctl limit maxproc 2000 2048 (for processes/threads)
  • ulimit -u 1000 (for processes/threads) (note: sometimes these changes won't be accepted: open a new window and try again)
  • source .venv/bin/activate
  • DJANGO_SETTINGS_MODULE=t3mp3st_api.settings.loadtesting python manage.py runserver, or
  • DJANGO_SETTINGS_MODULE=t3mp3st_api.settings.loadtesting gunicorn --workers=8 --threads=2 --worker-connections=3000 t3mp3st_api.wsgi:application, or
  • DJANGO_SETTINGS_MODULE=t3mp3st_api.settings.loadtesting uwsgi --http :8000 --wsgi-file ./t3mp3st_api/wsgi.py --master --processes 4, or
  • DJANGO_SETTINGS_MODULE=t3mp3st_api.settings.loadtesting daphne -b 0.0.0.0 -p 8001 t3mp3st_api.asgi:application, or
  • DJANGO_SETTINGS_MODULE=t3mp3st_api.settings.loadtesting uvicorn --workers=5 --lifespan off t3mp3st_api.asgi:application, or
  • DJANGO_SETTINGS_MODULE=t3mp3st_api.settings.loadtesting uvicorn --workers=9 --lifespan off --host 0.0.0.0 t3mp3st_api.asgi:application

If uvicorn runs on AWS/EC2, use the --host 0.0.0.0 option.

  5. Locust (new terminal window):
  • source .venv/bin/activate
  • raise the ulimit as well (as suggested by Locust), otherwise it won't be able to access the hosts file and you will get "gaierror(8, 'nodename nor servname provided, or not known')": ulimit -n 10000
  • locust -f ../test/load/locustfile.py
  6. Run the test:

LOAD TEST ISSUES:

  • use the loadtesting settings (based on the production settings) to avoid more "too many open files" issues
  • too many open files >> ulimit -n 10212 (for django, memcached, locust and pgbouncer)
  • too many DB connections >> use pgbouncer
  • too many threads:

Using the async server with daphne:
DJANGO_SETTINGS_MODULE=t3mp3st_api.settings.loadtesting daphne -b 0.0.0.0 -p 8001 t3mp3st_api.asgi:application

Using the async server with uvicorn:
DJANGO_SETTINGS_MODULE=t3mp3st_api.settings.loadtesting uvicorn --workers=9 --lifespan off t3mp3st_api.asgi:application

On the local machine:

memcached -m 2048 -c 5000 -s ../3rdparty/memcached/memcached.sock -a 0770 start
pgbouncer pgbouncer.simple.ini -q

sudo -u postgres /Library/PostgreSQL/11/bin/pg_ctl -D /Library/PostgreSQL/11/data restart

More about system limits: https://unix.stackexchange.com/questions/108174/how-to-persistently-control-maximum-system-resource-consumption-on-mac
