Train LLMs (BLOOM, LLaMA, Baichuan2-7B, ChatGLM3-6B) with DeepSpeed pipeline mode. Faster than ZeRO/ZeRO++/FSDP.
nlp
bloom
pipeline
pytorch
deepspeed
llm
full-finetune
model-parallization
flash-attention
llama2
baichuan2-7b
chatglm3-6b
mixtral-8x7b
Updated Feb 5, 2024 - Python