AlexNet backward shape mismatch + ReLU returns a tuple #681

Open
Belegkarnil opened this issue Apr 20, 2020 · 7 comments

Comments

@Belegkarnil

Hi,

I have implemented AlexNet in Singa, but I get an error during the backward_and_update call. I am using Singa 3.0.0.rc1 on CPU.

This is my AlexNet implementation:
```python
from singa import autograd
from singa import module
from singa import opt

__all__ = ['AlexNet', 'alexnet']


class AlexNet(module.Module):

    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()
        # 12 on GPU, so split into 6 & 6
        self.features1 = [
            autograd.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            autograd.ReLU(),
            autograd.MaxPool2d(kernel_size=3, stride=2),
            autograd.Conv2d(64, 192, kernel_size=5, padding=2),
            autograd.ReLU(),
            autograd.MaxPool2d(kernel_size=3, stride=2),
            autograd.Conv2d(192, 384, kernel_size=3, padding=1),
            autograd.ReLU(),
            autograd.Conv2d(384, 256, kernel_size=3, padding=1),
            autograd.ReLU()
        ]
        self.features2 = [
            autograd.Conv2d(256, 256, kernel_size=3, padding=1),
            autograd.ReLU(),
            autograd.MaxPool2d(kernel_size=3, stride=2)
        ]
        self.avgpool = autograd.AvgPool2d(6, stride=1)
        self.flatten = autograd.Flatten()
        self.classifier = [
            autograd.Dropout(),
            autograd.Linear(256 * 6 * 6, 4096),
            autograd.ReLU(),
            autograd.Dropout(),
            autograd.Linear(4096, 4096),
            autograd.ReLU(),
            autograd.Linear(4096, num_classes)
        ]
        self.optimizer = opt.SGD(lr=0.001, momentum=0.9)

    def loss(self, out, ty):
        return autograd.softmax_cross_entropy(out, ty)

    def optim(self, loss, dist_option, spars):
        if dist_option == 'fp32':
            self.optimizer.backward_and_update(loss)
        elif dist_option == 'fp16':
            self.optimizer.backward_and_update_half(loss)
        elif dist_option == 'partialUpdate':
            self.optimizer.backward_and_partial_update(loss)
        elif dist_option == 'sparseTopK':
            self.optimizer.backward_and_sparse_update(loss, topK=True, spars=spars)
        elif dist_option == 'sparseThreshold':
            self.optimizer.backward_and_sparse_update(loss, topK=False, spars=spars)

    def forward(self, x):
        for (i, layers) in enumerate([self.features1, self.features2,
                                      [self.avgpool, self.flatten], self.classifier]):
            for (j, fn) in enumerate(layers):
                x = fn(x)
                if type(x) is tuple:  # FIXME I have to do this because of a bug in Singa? (ReLU)
                    x = x[0]
        return x


def alexnet(**kwargs):
    return AlexNet(**kwargs)
```
And I get: `AssertionError: ('shape mismatch', (9216, 4096), (256, 4096))`, which corresponds to my first linear layer: `256 * 6 * 6, 4096`.

When I use my VGG16 implementation, I get a similar error:
`AssertionError: ('shape mismatch', (25088, 4096), (512, 4096))`

It seems that the backward operation does not map the correct shape to the corresponding layer.

Moreover, the ReLU class returns a 1-tuple containing a Tensor. Is that intended, or is it a bug?
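
For reference, a minimal snippet of the kind I use to observe the tuple return (the shape and random initialization here are just illustrative):

```python
from singa import autograd, device, tensor

# Illustrative only: a small random input tensor on the default (CPU) device.
dev = device.get_default_device()
x = tensor.Tensor((2, 8), dev)
x.gaussian(0.0, 1.0)

relu_layer = autograd.ReLU()
out = relu_layer(x)

# On my setup this prints <class 'tuple'> rather than a Tensor,
# which is why forward() unwraps x[0] above.
print(type(out))
```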

@dcslin
Member

dcslin commented Apr 22, 2020

Hi, as pointed out by @chrishkchris, the convention is to use ReLU as a stateless layer.
Usage:
https://github.com/apache/singa/blob/master/examples/cnn/model/cnn.py#L40

For the shape mismatch, you might need to check the layer shapes again. Let me know if more info is required.
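
A minimal sketch of the stateless pattern from the linked example (the network and shapes here are just illustrative, not your AlexNet):

```python
from singa import autograd, module, opt

class TinyNet(module.Module):

    def __init__(self, num_classes=10):
        super(TinyNet, self).__init__()
        self.conv = autograd.Conv2d(3, 32, kernel_size=3, padding=1)
        self.pool = autograd.MaxPool2d(kernel_size=2, stride=2)
        # For a 3x32x32 input: conv -> 32x32x32, pool -> 32x16x16, flatten -> 8192.
        self.fc = autograd.Linear(32 * 16 * 16, num_classes)
        self.optimizer = opt.SGD(lr=0.01, momentum=0.9)

    def forward(self, x):
        y = self.conv(x)
        y = autograd.relu(y)     # stateless function, returns a Tensor
        y = self.pool(y)
        y = autograd.flatten(y)  # stateless flatten, same idea
        return self.fc(y)

    def loss(self, out, ty):
        return autograd.softmax_cross_entropy(out, ty)
```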

@Belegkarnil
Author

OK, I'll try, but why provide a stateful ReLU layer? Is it for a specific purpose?

@Belegkarnil
Author

I compared my implementation with other frameworks and the shapes are the same.
Moreover, the forward pass does not cause any issue; it is the backward pass that fails.
This is why I suspect a bug. Is that possible?

@nudles
Member

nudles commented Apr 23, 2020

> Hi, as pointed out by @chrishkchris, the convention is to use ReLU as a stateless layer.
> Usage:
> https://github.com/apache/singa/blob/master/examples/cnn/model/cnn.py#L40
>
> For the shape mismatch, you might need to check the layer shapes again. Let me know if more info is required.

@dcslin Did you try to run the code pasted by @Belegkarnil?
Can you reproduce the error?

@dcslin
Member

dcslin commented Apr 24, 2020

> > Hi, as pointed out by @chrishkchris, the convention is to use ReLU as a stateless layer.
> > Usage:
> > https://github.com/apache/singa/blob/master/examples/cnn/model/cnn.py#L40
> > For the shape mismatch, you might need to check the layer shapes again. Let me know if more info is required.
>
> @dcslin Did you try to run the code pasted by @Belegkarnil? Can you reproduce the error?

I am still checking the code.

@dcslin
Member

dcslin commented Apr 29, 2020

Hi @Belegkarnil, you might need to change `256 * 6 * 6, 4096` to `256, 4096` to make it work.

Also, it is recommended to use relu/dropout/flatten as in https://github.com/apache/singa/blob/master/examples/cnn/model/cnn.py#L40
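
For instance, the classifier block would then look roughly like this (a sketch of the change only, assuming the AvgPool2d(6, stride=1) output is 256x1x1 and therefore flattens to 256 features):

```python
# Fragment of AlexNet.__init__ with the suggested change applied.
self.classifier = [
    autograd.Dropout(),
    autograd.Linear(256, 4096),   # was autograd.Linear(256 * 6 * 6, 4096)
    autograd.ReLU(),
    autograd.Dropout(),
    autograd.Linear(4096, 4096),
    autograd.ReLU(),
    autograd.Linear(4096, num_classes)
]
```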

@Belegkarnil
Author

OK, thanks a lot! I assumed it worked like other frameworks, but the result of AvgPool apparently has a different shape.
