LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Topics: llama, cuda-kernels, deepspeed, llm, fastertransformer, llm-inference, turbomind, internlm, llama2, codellama, llama3
Updated Nov 22, 2024 - Python