Training runs

Overview

Run #	Reference	Summary	Currently Active	Net Numbers	Best nets
NA	Old Main	Original 192x15 "main" run	No	1 to 601	ID595
test10	Lc0 Transition	Original 256x20 test run	No	10'000 to 11'262	11250 11248
test20	Training run reset	Many changes, see blog.	Yes
test30	TB rescoring	Experiments with the network initialization strategy (an attempt to solve spike issues) and with tablebase rescoring.	Yes

Most of the data below comes from this sheet.

Alpha Zero reference paper
Uses a best guess for game length and assumes that resignation cuts game length by 30%.
Old Main
Initially, new networks were generated on a fixed time schedule rather than after a fixed number of games.
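
The resign assumption for the Alpha Zero columns can be checked with a little arithmetic. The sketch below assumes that the 135 positions per game in the table below is the best-guess game length without resignation, and that the 95 shown for the resign column is 94.5 rounded up; the constant names are illustrative and do not come from the sheet.

```python
# Values taken from the table; the names are illustrative assumptions.
FULL_GAME_POSITIONS = 135          # best-guess positions per game without resignation
RESIGN_REDUCTION = 0.30            # resignation assumed to cut game length by 30%
TOTAL_TRAINING_GAMES = 44_000_000  # Alpha Zero total training games

# Positions per game once resignation shortens the game.
positions_with_resign = FULL_GAME_POSITIONS * (1 - RESIGN_REDUCTION)

# Total positions generated under each assumption.
total_with_resign = TOTAL_TRAINING_GAMES * positions_with_resign
total_without_resign = TOTAL_TRAINING_GAMES * FULL_GAME_POSITIONS

print(f"Positions per game with resign: {positions_with_resign:.1f}")
print(f"Total positions with resign:    {total_with_resign / 1e9:.3f} B")
print(f"Total positions without resign: {total_without_resign / 1e9:.3f} B")
```

Under these assumptions the totals come out at 4.158 B and 5.940 B, which matches the "Total positions generated" row.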

Item	A0 with resign	A0 w/out resign	Main up to ID xxx	Main from ID xxx	Main from IDyyy to ID598	Test 10	Test 20
Positions per training game	95	135	135	135	135	135	-----------
New networks per day	-----------		6	6
Training games per day	-----------		160,000	160,000
Training games per network	-----------		26,700	26,700	40,000	40,000
Total training games	44,000,000	44,000,000			25,000,000
Positions generated per day	-----------	-------------	21,600,000	21,600,000
Positions generated per network	-----------	-------------	3,600,000	3,600,000	5,400,000	5,400,000
Total positions generated	4.158 B	5.940 B
Batch size	4,096	4,096	1,024	256	256	2,048
Training steps per day	-----------	-------------	300,000	300,000
Training steps per network	-----------	-------------	50,000	50,000	10,000	2,500
Total training steps	700,000	700,000
Positions trained per day	-----------	-------------	307,200,000	76,800,000
Positions trained per network	-----------	-------------	51,200,000	12,800,000	2,560,000	5,120,000
Total positions trained	2.867 B	2.867 B
Sampling ratio	0.69	0.48	14.22	3.55	0.47	0.95	0.89
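
The sheet does not spell out how the sampling ratio is derived. The numbers are consistent with positions trained (training steps times batch size) divided by positions generated, so the sketch below works on that assumption. It also assumes that rows with fewer cells than columns fill the columns from the left. The function names and dictionary layout are illustrative only, and Test 20 is left out because the table gives no inputs for it.

```python
def positions_trained(steps: int, batch_size: int) -> int:
    """Positions seen in training: one batch per training step."""
    return steps * batch_size


def sampling_ratio(trained: float, generated: float) -> float:
    """Assumed definition: positions trained per position generated."""
    return trained / generated


# (training steps, batch size, positions generated), copied from the table.
# A0 columns use run totals; the other columns use per-network values.
runs = {
    "A0 with resign": (700_000, 4_096, 4.158e9),
    "A0 w/out resign": (700_000, 4_096, 5.940e9),
    "Main up to ID xxx": (50_000, 1_024, 3_600_000),
    "Main from ID xxx": (50_000, 256, 3_600_000),
    "Main from IDyyy to ID598": (10_000, 256, 5_400_000),
    "Test 10": (2_500, 2_048, 5_400_000),
}

ratios = {
    name: sampling_ratio(positions_trained(steps, batch), generated)
    for name, (steps, batch, generated) in runs.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

A ratio above 1 means each generated position is sampled more than once on average during training, and a ratio below 1 means some positions are never sampled. The results agree with the table to within rounding.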