# Docker Image ― OpenAI API-Compatible Pre-loaded LLM Server


These Docker images are based on NVIDIA CUDA base images. Each image pre-loads an LLM and serves it via vLLM.

## Environment Variables

- `TENSOR_PARALLEL_SIZE`: Number of GPUs to use for tensor parallelism. Default: `1`.

## Port

The OpenAI-compatible API is exposed on port `8000`.
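As a minimal sketch of how the pieces above fit together (assuming the image accepts standard `docker run` flags, and using one of the tags from the table below), you might start a container and query the server like this:

```shell
# Run the image with 2 GPUs, passing the tensor-parallelism setting
# via the TENSOR_PARALLEL_SIZE environment variable and mapping port 8000.
docker run --gpus all \
  -p 8000:8000 \
  -e TENSOR_PARALLEL_SIZE=2 \
  ivangabriele/llm:lmsys__vicuna-13b-v1.5-16k

# Once the server is up, any OpenAI-compatible client works.
# vLLM's OpenAI-compatible server exposes endpoints such as /v1/models:
curl http://localhost:8000/v1/models
```

Because the API is OpenAI-compatible, existing OpenAI client libraries can be pointed at `http://localhost:8000/v1` instead of the official API endpoint.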

## Tags & Deployment Links

> **Note**
> The VRAM column is the minimum amount of VRAM the model requires on a single GPU.

| Tag | Model | RunPod | Vast.ai | VRAM |
| --- | --- | --- | --- | --- |
| `ivangabriele/llm:lmsys__vicuna-13b-v1.5-16k` | img-huggingface | img-runpod | img-vastai | 26 GB |
| `ivangabriele/llm:open-orca__llongorca-13b-16k` | img-huggingface | img-runpod | img-vastai | 26 GB |

## Roadmap

- Add more popular models.
- Start the server in the background to allow for SSH access.