MaurizioDeLeo edited this page Oct 11, 2018 · 29 revisions

Overview

| Run # | Reference | Summary | Currently Active | Net Numbers | Best nets |
| --- | --- | --- | --- | --- | --- |
| NA | Old Main | Original 192x15 "main" run | No | 1 to 601 | ID595 |
| test10 | Lc0 Transition | Original 256x20 test run | No | 10'000 to 11'262 | 11250 11248 |
| test20 | Training run reset | Many changes, see blog. | Yes | | |
| test30 | TB rescoring | Experiment with network initialization strategy, trying to solve spike issues. Experiment with Tablebase rescoring. | Yes | | |

WORK IN PROGRESS

LR Drop

Training Run 1st LR drop Elo 2nd LR drop Elo 3rd LR drop Elo Best Net Elo
Old Main
Test 10
Test 20 ID 20247 2318
Test 30

Sampling ratio

Most of the data below is taken from this sheet

  • AlphaZero reference paper
    Uses a best guess for game length and assumes that resignation cuts game length by 30%
  • Old Main
    Initially, new networks were generated on a fixed time schedule rather than after a fixed number of games
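The 30% resign assumption can be checked against the A0 figures in the table that follows. This is a minimal sketch, assuming that the 135 positions per game and the 44,000,000 total training games belong to the A0 columns. The function and variable names are illustrative and do not come from the sheet.

```python
def positions_per_game(base_positions: float, resign_cut: float) -> float:
    """Average positions per game once resignation shortens games by resign_cut."""
    return base_positions * (1.0 - resign_cut)


A0_TOTAL_GAMES = 44_000_000  # total training games listed for A0
BASE_POSITIONS = 135         # positions per game without resign
RESIGN_CUT = 0.30            # assumed reduction in game length from resignation

with_resign = positions_per_game(BASE_POSITIONS, RESIGN_CUT)
total_with_resign = A0_TOTAL_GAMES * with_resign
total_without_resign = A0_TOTAL_GAMES * BASE_POSITIONS

print(f"{with_resign:.1f} positions per game with resign")
print(f"{total_with_resign / 1e9:.3f} B positions generated with resign")
print(f"{total_without_resign / 1e9:.3f} B positions generated without resign")
```

Under these assumptions the result is 94.5 positions per game, which the table rounds to 95, and the two totals match the 4.158 B and 5.940 B listed under "Total positions generated".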
Item A0 with resign A0 w/out resign Main up to ID xxx Main from ID xxx Main from IDyyy to ID598 Test 10 Test 20
Positions per training game 95 135 135 135 135 135 -----------
New networks per day ----------- 6 6
Training Games per day ----------- 160,000 160,000
Training Games per network ----------- 26,700 26,700 40,000 40,000
Total training games 44,000,000 44,000,000 25,000,000
Positions generated per day ----------- ------------- 21,600,000 21,600,000
Positions generated per network ----------- ------------- 3,600,000 3,600,000 5,400,000 5,400,000
Total positions generated 4.158 B 5.940 B
Batch size 4,096 4,096 1,024 256 256 2,048
Training steps per day ----------- ------------- 300,000 300,000
Training steps per network ----------- ------------- 50,000 50,000 10,000 2,500
Total training steps 700,000 700,000
Positions trained per day ----------- ------------- 307,200,000 76,800,000
Positions trained per network ----------- ------------- 51,200,000 12,800,000 2,560,000 5,120,000
Total positions trained 2.867 B 2.867 B
Sampling ratio 0.69 0.48 14.22 3.55 0.47 0.95 0.89
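
The sampling ratio appears to be the number of positions trained (batch size times training steps) divided by the number of positions generated by self-play. The sketch below recomputes the ratio for most columns on that assumption. The pairing of values with columns is inferred from the arithmetic and is not stated in the source. All names are illustrative.

```python
def sampling_ratio(batch_size: int, steps: int, positions_generated: float) -> float:
    """Positions consumed by training divided by positions produced by self-play."""
    return (batch_size * steps) / positions_generated


# (batch size, training steps, positions generated) for each column.
# A0 uses totals for the whole run; the Lc0 runs use per-network figures.
# The assignment of values to columns is an assumption inferred from the table.
columns = {
    "A0 with resign": (4096, 700_000, 4.158e9),
    "A0 w/out resign": (4096, 700_000, 5.940e9),
    "Main up to ID xxx": (1024, 50_000, 3_600_000),
    "Main from ID xxx": (256, 50_000, 3_600_000),
    "Main from IDyyy to ID598": (256, 10_000, 5_400_000),
    "Test 10": (2048, 2_500, 5_400_000),
}

ratios = {name: sampling_ratio(*values) for name, values in columns.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

With these inputs the computed ratios agree with the listed values, except for "Main from ID xxx", where the calculation rounds to 3.56 against the listed 3.55. Test 20 is left out because the table gives no batch size or step count for it.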