Serverless LLM Serving for Everyone.
cuda
pytorch
model-serving
model-as-a-service
huggingface-transformers
large-language-models
serverless-inference
Updated Nov 23, 2024 - Python