
Normalization of CNN in CIFAR-10 experiments #1

avinashsai opened this issue Dec 25, 2019 · 1 comment


avinashsai commented Dec 25, 2019

Hi,
Congratulations on the amazing work. I have some questions regarding RMS normalization.

  1. Which dimensions should be considered when normalizing a CNN? In the PyTorch code, the default axis is -1, which corresponds to the width dimension of a PyTorch CNN feature map (N, C, H, W). In TensorFlow, however, the last axis is the channel dimension (N, H, W, C).

  2. Can the normalization be applied to other dimensions as well? For example, in the CIFAR-10 experiments, LayerNorm was applied to the width and height dimensions.

Thank you.

bzhangGo (Owner) commented

@avinashsai Thanks for pointing this out.

  1. The PyTorch (rmsnorm_torch) and TensorFlow (rmsnorm_tensorflow) code does NOT cover the CNN case. By default, the code is intended for RNN, feed-forward, and attention networks, where the normalization is applied to the last dimension.

  2. For normalizing a CNN, I follow LayerNorm and apply the normalization to the width and height dimensions (see the sketch below). Please refer to the CIFAR-10 Classification section in the README for more details.
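Here is a minimal PyTorch sketch of what I mean. This is illustrative only, not the rmsnorm_torch code from the repository; the class name, the eps value, and the shape of the learnable gain are assumptions.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Minimal RMSNorm sketch: x / rms(x) * gain, computed over `dims`."""
    def __init__(self, normalized_shape, dims=(-1,), eps=1e-8):
        super().__init__()
        self.dims = dims
        self.eps = eps
        # One learnable gain per normalized element, broadcast over the rest.
        self.scale = nn.Parameter(torch.ones(normalized_shape))

    def forward(self, x):
        # Root mean square over the chosen dimensions; no mean subtraction.
        rms = x.pow(2).mean(dim=self.dims, keepdim=True).add(self.eps).sqrt()
        return x / rms * self.scale

# 1. RNN / feed-forward / attention case: normalize the last (feature) dimension.
ff_norm = RMSNorm(normalized_shape=512, dims=(-1,))
h = torch.randn(32, 10, 512)            # (batch, time, features)
print(ff_norm(h).shape)                 # torch.Size([32, 10, 512])

# 2. CNN case (following LayerNorm in the CIFAR-10 setup): normalize width and height.
cnn_norm = RMSNorm(normalized_shape=(32, 32), dims=(-2, -1))
f = torch.randn(32, 64, 32, 32)         # (batch, channels, height, width)
print(cnn_norm(f).shape)                # torch.Size([32, 64, 32, 32])
```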

Biao
