
Pre-loaded LLMs served as an OpenAI-Compatible API via Docker images.


ivangabriele/docker-llm

Docker Image ― OpenAI API-Compatible Pre-loaded LLM Server

The Docker images are based on NVIDIA CUDA base images. LLMs are pre-loaded into each image and served via vLLM.
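Under the hood, each image presumably launches vLLM's OpenAI-compatible server along these lines (a sketch; the exact entrypoint arguments are an assumption, not taken from the repo):

```shell
# Sketch of the vLLM OpenAI-compatible server launch (entrypoint details assumed).
python -m vllm.entrypoints.openai.api_server \
  --model lmsys/vicuna-13b-v1.5-16k \
  --tensor-parallel-size "${TENSOR_PARALLEL_SIZE:-1}" \
  --port 8000
```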

Environment Variables

  • TENSOR_PARALLEL_SIZE: Number of GPUs to shard the model across (vLLM tensor parallelism). Default: 1.
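As an example, the variable can be passed with `docker run` (image tag taken from the table below; `--gpus all` assumes the NVIDIA Container Toolkit is installed):

```shell
# Run the pre-loaded Vicuna image on 2 GPUs, exposing the API on port 8000.
docker run --gpus all \
  -e TENSOR_PARALLEL_SIZE=2 \
  -p 8000:8000 \
  ivangabriele/llm:lmsys__vicuna-13b-v1.5-16k
```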

Port

The OpenAI-compatible API is exposed on port 8000.
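Once a container is running, the server can be queried like any OpenAI-style endpoint. A minimal Python sketch, assuming the container is reachable at localhost:8000 and using a model name matching the image's pre-loaded model (the helper names here are illustrative, not part of the project):

```python
import json
import urllib.request

# Assumed endpoint: the container's OpenAI-compatible API on port 8000.
API_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(model: str, prompt: str) -> str:
    """POST the request to the container and return the assistant's reply."""
    payload = json.dumps(build_chat_request(model, prompt)).encode()
    req = urllib.request.Request(
        API_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example (requires a running container):
# print(chat("lmsys/vicuna-13b-v1.5-16k", "Hello!"))
```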

Tags & Deployment Links

Note

The VRAM column lists the minimum amount of VRAM the model requires on a single GPU.

Tag                                            Model                        VRAM
ivangabriele/llm:lmsys__vicuna-13b-v1.5-16k    lmsys/vicuna-13b-v1.5-16k    26GB
ivangabriele/llm:open-orca__llongorca-13b-16k  open-orca/llongorca-13b-16k  26GB

(In the original table, each row also links to the model's Hugging Face page and to one-click RunPod and Vast.ai deployment templates.)

Roadmap

  • Add more popular models.
  • Start the server in the background to allow SSH access.