Recommendation for inclusion
Link
https://github.com/apple/ml-cross-entropy
Reason
This project is based on the paper "Cut Your Losses in Large-Vocabulary Language Models" and proposes a new method, Cut Cross-Entropy (CCE), for optimizing the cross-entropy loss computation in large-vocabulary language models. By computing the logit only for the correct token rather than materializing the full logit matrix, the method dramatically reduces memory requirements: in tests on a small model, the memory consumption of the loss computation reportedly dropped from 24 GB to 1 MB, which also speeds up training.
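To make the mechanism concrete, here is a minimal plain-PyTorch sketch of the idea. This is an illustration only, not the repository's API: the project itself implements this as a fused Triton kernel, and the function name, tensor names, and `chunk_size` below are all hypothetical.

```python
import torch

def memory_efficient_cross_entropy(hidden, weight, labels, chunk_size=4096):
    """Illustrative sketch of the CCE idea (hypothetical helper, not the repo's API).

    hidden: (N, D) token embeddings before the LM head
    weight: (V, D) LM-head weight matrix
    labels: (N,)  indices of the correct tokens
    """
    # Logit of the correct token only: one dot product per token,
    # instead of the full (N, V) logit matrix.
    correct_logit = (hidden * weight[labels]).sum(dim=-1)  # (N,)

    # log-sum-exp over the vocabulary, computed in chunks so that at most
    # (N, chunk_size) logits are ever materialized at once.
    lse = torch.full_like(correct_logit, float("-inf"))
    for start in range(0, weight.shape[0], chunk_size):
        chunk_logits = hidden @ weight[start:start + chunk_size].T  # (N, C)
        lse = torch.logaddexp(lse, torch.logsumexp(chunk_logits, dim=-1))

    # Cross-entropy is -log softmax at the correct token: lse - correct_logit.
    return (lse - correct_logit).mean()

# Sanity check against the naive full-logit implementation:
hidden = torch.randn(8, 16)
weight = torch.randn(1000, 16)
labels = torch.randint(0, 1000, (8,))
naive = torch.nn.functional.cross_entropy(hidden @ weight.T, labels)
assert torch.allclose(memory_efficient_cross_entropy(hidden, weight, labels), naive, atol=1e-5)
```

Chunking bounds peak activation memory at N × chunk_size logits instead of N × V; the paper's fused kernel reportedly goes further by never writing even the chunk logits to global memory, which is where the 24 GB → 1 MB figure comes from.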
Recommender information
If this resource is accepted for inclusion, we will credit the recommender after the review.