Speed Benchmark

Hugging Face

  • internlm-7b
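
| seq_len | node | zero stage | micro bs | global bs | time (s/iter) | tokens/s | mem (GiB) | checkpoint layers |
|---|---|---|---|---|---|---|---|---|
| 2048 | 1 | 1 | 1 | 512 | 31.3 | 4187 | 68G | 0 |
| 4096 | 1 | 1 | 1 | 512 | 60.9 | 4304 | 71G | 0 |
| 8192 | 1 | 2 | 1 | 512 | 146.2 | 3586 | 79G | 5 |
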
  • qwen14b
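
| seq_len | node | zero stage | micro bs | global bs | time (s/iter) | tokens/s | mem (GiB) | checkpoint layers |
|---|---|---|---|---|---|---|---|---|
| 2048 | 1 | 2 | 1 | 512 | 62.8 | 2087 | 80G | 0 |
| 4096 | 1 | 2 | 1 | 512 | 141.9 | 1847 | 80G | 30 |
| 8192 | 1 | 2 | 1 | 512 | 314.5 | 1667 | 80G | all |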
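
The tables do not say how the tokens/s column is derived. The reported values are consistent with per-GPU throughput, that is `seq_len * global bs / (time * GPUs)`, under the assumption that each node has 8 GPUs. The sketch below recomputes the column under that assumption; `GPUS_PER_NODE` and the helper name `tokens_per_gpu_per_sec` are illustrative and do not come from the benchmark itself.

```python
def tokens_per_gpu_per_sec(seq_len, global_bs, time_per_iter, num_gpus):
    """Tokens processed per second per GPU for one training iteration."""
    return seq_len * global_bs / (time_per_iter * num_gpus)


# Assumption (not stated in the tables): each node has 8 GPUs.
GPUS_PER_NODE = 8

# (model, seq_len, nodes, global bs, time in s/iter, reported tokens/s)
HF_ROWS = [
    ("internlm-7b", 2048, 1, 512, 31.3, 4187),
    ("internlm-7b", 4096, 1, 512, 60.9, 4304),
    ("internlm-7b", 8192, 1, 512, 146.2, 3586),
    ("qwen14b", 2048, 1, 512, 62.8, 2087),
    ("qwen14b", 4096, 1, 512, 141.9, 1847),
    ("qwen14b", 8192, 1, 512, 314.5, 1667),
]

for model, seq_len, nodes, global_bs, t, reported in HF_ROWS:
    computed = tokens_per_gpu_per_sec(seq_len, global_bs, t, nodes * GPUS_PER_NODE)
    print(f"{model:12s} seq_len={seq_len:5d} computed={computed:7.1f} reported={reported}")
```

Every computed value lands within one token/s of the reported figure, which is why the per-GPU reading seems plausible.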

Megatron

  • 70b llama-2

| seq_len | gpu_num | pp method | have_checkpoint_layers | micro bs | global bs | time (s/iter) | tokens/s | mem (GiB) |
|---|---|---|---|---|---|---|---|---|
| 2048 | TP 4 PP 8 DP 1 | parameters | none | 1 | 512 | 95.5 | 343 | 66G |
| 4096 | TP 4 PP 8 DP 1 | parameters | none | 1 | 512 | 172.3 | 380 | 80G |
| 8192 | TP 4 PP 8 DP 8 | parameters | 3 * pp | 1 | 1024 | 96 | 341 | 80G |
| 32k | TP 8 PP 8 DP 2 | parameters | all | 1 | 512 | 607 | 216 | 78G |
| 64k | TP 8 PP 8 DP 2 | parameters | all | 1 | 512 | 2377 | 110 | 80G |
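
For the Megatron runs, the GPU count can be read off the parallel layout as TP × PP × DP, and the tokens/s column again matches `seq_len * global bs / (time * GPUs)`. The sketch below checks this; it assumes that 32k and 64k mean 32 × 1024 and 64 × 1024 tokens, and the helper names `gpu_count` and `tokens_per_gpu_per_sec` are hypothetical, not part of Megatron.

```python
import re
from math import prod


def gpu_count(layout):
    """Multiply the TP, PP and DP degrees in a string such as 'TP 4 PP 8 DP 1'."""
    degrees = dict(re.findall(r"(TP|PP|DP)\s*(\d+)", layout))
    return prod(int(degrees[key]) for key in ("TP", "PP", "DP"))


def tokens_per_gpu_per_sec(seq_len, global_bs, time_per_iter, num_gpus):
    """Tokens processed per second per GPU for one training iteration."""
    return seq_len * global_bs / (time_per_iter * num_gpus)


# Assumption: "32k" and "64k" denote 32 * 1024 and 64 * 1024 tokens.
# (seq_len, layout, global bs, time in s/iter, reported tokens/s)
MEGATRON_ROWS = [
    (2048, "TP 4 PP 8 DP 1", 512, 95.5, 343),
    (4096, "TP 4 PP 8 DP 1", 512, 172.3, 380),
    (8192, "TP 4 PP 8 DP 8", 1024, 96, 341),
    (32 * 1024, "TP8 PP8 DP2", 512, 607, 216),
    (64 * 1024, "TP8 PP8 DP2", 512, 2377, 110),
]

for seq_len, layout, global_bs, t, reported in MEGATRON_ROWS:
    gpus = gpu_count(layout)
    computed = tokens_per_gpu_per_sec(seq_len, global_bs, t, gpus)
    print(f"seq_len={seq_len:6d} gpus={gpus:4d} computed={computed:6.1f} reported={reported}")
```

Under this reading the five runs use 32, 32, 256, 128 and 128 GPUs respectively, so the 8192 row is not directly comparable to the others in wall-clock time per iteration.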