Page 6 of the paper says:
For the general distillation, we set the maximum sequence length to 128 and use English Wikipedia (2,500M words)
I downloaded the latest dump from the link given in https://github.com/google-research/bert.
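For reference, a minimal sketch of fetching that dump, assuming it is the standard Wikimedia "latest" English dump that the BERT README points to (the local filename is just a placeholder):

```python
import urllib.request

# Standard Wikimedia "latest" English dump location (assumed to be the link
# the BERT README refers to); the local filename below is a placeholder.
DUMP_URL = "https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2"
urllib.request.urlretrieve(DUMP_URL, "enwiki-latest-pages-articles.xml.bz2")
```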
Decompressing the archive produced an 86 GB XML file. Running this project's preprocessing code on it kept failing with out-of-disk-space errors and died every ten-odd hours. After digging through the code, I changed the path in self.document_shelf_filepath at line 52 of pregenerate_training_data.py from the /cache/ directory to a 500 GB directory on an external disk. The out-of-disk-space errors finally stopped, but processing is very slow: it took 84 hours just to get from line 367 to line 390.
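For clarity, here is a minimal sketch of the change described above, assuming the script stores its intermediate documents in a shelve database under a working directory (as the reduced-memory path of pregenerate_training_data.py does); /data/wiki_preprocess is a hypothetical stand-in for the 500 GB external-disk directory:

```python
import shelve
from pathlib import Path

# Original: the shelf database lived under /cache/, a small partition that
# kept filling up during preprocessing.
# working_dir = Path('/cache')

# Workaround: point the shelf at a directory on the larger external disk.
# '/data/wiki_preprocess' is a hypothetical path; substitute your own.
working_dir = Path('/data/wiki_preprocess')
working_dir.mkdir(parents=True, exist_ok=True)

document_shelf_filepath = working_dir / 'shelf.db'
# 'n' creates a fresh database each run.
document_shelf = shelve.open(str(document_shelf_filepath), flag='n', protocol=-1)
```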
Then came the truly crushing part: 3 epochs of data still have to be generated, and after another 2 days of running, only 5% of the first epoch was finished. That works out to 40 days per epoch, or 120 days for all 3 epochs!
Does the data preprocessing alone really take this long? And even once it finishes, GPU training is still ahead; will that take even longer?
Which dataset did the paper actually use? Would it run faster on the Huawei Cloud platform?