Question about the evaluation metric for step one (intermediate-layer distillation) of TinyBERT Task-specific Distillation #222

Open
wsh2836741 opened this issue Nov 11, 2022 · 2 comments


@wsh2836741

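Thanks to the team for the excellent work. I have a question about step one of Task-specific Distillation, the intermediate-layer distillation. Suppose my task is classification. Since intermediate-layer distillation does not train the parameters of the final classification layer, what is the evaluation metric for this first step? Or is there no need to track an evaluation metric at all, so that it is enough to watch the loss decrease and the model converge? Thank you very much!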

@charliezjw

As I understand it, gradients only flow backward, so a loss computed on the first L layers does not update the weights of layer L+1.
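A minimal sketch of that point, using a toy scalar "student" rather than TinyBERT itself. All names and numbers (`w_hidden`, `w_cls`, `teacher_hidden`, and so on) are hypothetical and chosen only for illustration. The intermediate loss is an MSE between the student's and teacher's hidden states, and a finite-difference check shows that its gradient with respect to the classifier weight is zero, because the classifier sits after the layer where the loss is computed.

```python
import math

def student_hidden(w_hidden, x):
    # Toy "intermediate layer": a single tanh unit.
    return math.tanh(w_hidden * x)

def intermediate_loss(w_hidden, w_cls, x, teacher_hidden):
    # MSE between student and teacher hidden states.
    # w_cls is accepted but never used: the classifier comes after this layer,
    # so the intermediate loss cannot depend on it.
    h = student_hidden(w_hidden, x)
    return (h - teacher_hidden) ** 2

def numeric_grad(f, value, eps=1e-6):
    # Central finite-difference estimate of df/dvalue.
    return (f(value + eps) - f(value - eps)) / (2 * eps)

# Hypothetical input, teacher target, and student weights.
x, teacher_hidden = 0.5, 0.8
w_hidden, w_cls = 0.3, -1.2

grad_hidden = numeric_grad(
    lambda w: intermediate_loss(w, w_cls, x, teacher_hidden), w_hidden
)
grad_cls = numeric_grad(
    lambda w: intermediate_loss(w_hidden, w, x, teacher_hidden), w_cls
)

print("gradient w.r.t. hidden-layer weight:", grad_hidden)  # non-zero
print("gradient w.r.t. classifier weight:", grad_cls)       # zero
```

In this toy setup the classifier weight stays at its initial value during step one, so a task metric such as accuracy computed through that classifier would not be meaningful at this stage.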

@wsh2836741
Author

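@charliezjw Yes, that is my understanding too. So when training step one of Task-specific Distillation (the intermediate-layer distillation), is there no need to track an evaluation metric, and is it enough to watch the loss decrease?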
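If the loss itself is used as the signal, one option is to compute the same intermediate-layer loss on a held-out batch and watch that value. The sketch below is an assumption about how that could look, not the repository's code: it follows the TinyBERT paper's description of a hidden-state MSE (with a learned projection from student width to teacher width) plus an attention MSE averaged over heads, for a single mapped student/teacher layer pair. The function names and the tiny example tensors are hypothetical.

```python
def mse(a, b):
    # Mean squared error over two equally shaped 2-D lists.
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for va, vb in zip(row_a, row_b):
            total += (va - vb) ** 2
            count += 1
    return total / count

def project(hidden, w_proj):
    # hidden: [seq_len][d_student], w_proj: [d_student][d_teacher]
    d_student, d_teacher = len(w_proj), len(w_proj[0])
    return [
        [sum(row[k] * w_proj[k][j] for k in range(d_student)) for j in range(d_teacher)]
        for row in hidden
    ]

def intermediate_distill_loss(student_hidden, teacher_hidden, w_proj,
                              student_attn, teacher_attn):
    # Hidden-state term: MSE(H_student * W, H_teacher).
    hidden_loss = mse(project(student_hidden, w_proj), teacher_hidden)
    # Attention term: MSE per head, averaged over heads.
    attn_loss = sum(mse(s, t) for s, t in zip(student_attn, teacher_attn)) / len(student_attn)
    return hidden_loss + attn_loss

# Hypothetical held-out example: seq_len=2, d_student=2, d_teacher=3, one head.
student_hidden = [[1.0, 0.0], [0.0, 1.0]]
teacher_hidden = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
w_proj = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
student_attn = [[[0.5, 0.5], [0.5, 0.5]]]
teacher_attn = [[[1.0, 0.0], [0.0, 1.0]]]

loss = intermediate_distill_loss(student_hidden, teacher_hidden, w_proj,
                                 student_attn, teacher_attn)
print("held-out intermediate distillation loss:", loss)
```

Logging this value on data the student is not trained on would show whether the intermediate representations are still moving toward the teacher's. Whether that is sufficient as a stopping criterion is the open question in this thread.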
