An impl of adafactor as per big vision (scaling vit) changes #2299
| Job | Run time |
|---|---|
| | 1m 30s |
| | 1h 0m 23s |
| | 1h 24m 1s |
| | 32m 5s |
| | 33m 55s |
| | 41m 37s |
| | 34m 28s |
| | 1m 48s |
| | 43m 14s |
| | 1h 5m 22s |
| | 32m 25s |
| | 30m 33s |
| | 39m 49s |
| | 32m 41s |
| | 1m 33s |
| | 42m 7s |
| | 1h 4m 12s |
| | 28m 4s |
| | 29m 34s |
| | 39m 16s |
| | 30m 53s |
| Total | 12h 49m 30s |