
Normalization of CNN in CIFAR-10 experiments #1

avinashsai opened this issue Dec 25, 2019 · 1 comment


avinashsai commented Dec 25, 2019

Hi,
Congratulations on the amazing work. I have some questions regarding RMS normalization.

  1. Which dimensions should be considered when normalizing a CNN? In the PyTorch code, the default axis is -1, which corresponds to the width dimension of a PyTorch CNN feature map (N, C, H, W). In TensorFlow, however, the last axis is the channel dimension (N, H, W, C).

  2. Can the normalization be applied to other dimensions as well? For example, in the CIFAR-10 experiments, LayerNorm was applied to the width and height dimensions.

Thank you.

bzhangGo (Owner) commented

@avinashsai Thanks for pointing this out.

  1. The PyTorch (rmsnorm_torch) and TensorFlow (rmsnorm_tensorflow) code does NOT cover the CNN case. By default, the code is intended for RNN, feed-forward, and attention networks, where the normalization is applied to the last dimension.

  2. For normalizing a CNN, I follow LayerNorm and apply the normalization to the width and height dimensions (see the sketch below). Please refer to the CIFAR-10 Classification section in the README for more details.
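Here is a minimal PyTorch sketch of what I mean. This is illustrative only, not the rmsnorm_torch code from the repository; the class name, the eps value, and the shape of the learnable gain are assumptions.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Minimal RMSNorm sketch: x / rms(x) * gain, computed over `dims`."""
    def __init__(self, normalized_shape, dims=(-1,), eps=1e-8):
        super().__init__()
        self.dims = dims
        self.eps = eps
        # One learnable gain per normalized element, broadcast over the rest.
        self.scale = nn.Parameter(torch.ones(normalized_shape))

    def forward(self, x):
        # Root mean square over the chosen dimensions; no mean subtraction.
        rms = x.pow(2).mean(dim=self.dims, keepdim=True).add(self.eps).sqrt()
        return x / rms * self.scale

# 1. RNN / feed-forward / attention case: normalize the last (feature) dimension.
ff_norm = RMSNorm(normalized_shape=512, dims=(-1,))
h = torch.randn(32, 10, 512)            # (batch, time, features)
print(ff_norm(h).shape)                 # torch.Size([32, 10, 512])

# 2. CNN case (following LayerNorm in the CIFAR-10 setup): normalize width and height.
cnn_norm = RMSNorm(normalized_shape=(32, 32), dims=(-2, -1))
f = torch.randn(32, 64, 32, 32)         # (batch, channels, height, width)
print(cnn_norm(f).shape)                # torch.Size([32, 64, 32, 32])
```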

Biao
