I am a beginner in deep learning, and I would like to know whether the gradients being 0 is caused by vanishing gradients or by my data being too small (batch_size=32).
I tried to add LoRA to a three-layer neural network, but only the gradients of the lora_A and lora_B matrices in the last layer were nonzero (though below 1e-2); the gradients of all the other layers were exactly 0.
My definition of the lora.Linear layers is as follows:
self.prednet_full1_lora = lora.Linear(self.prednet_input_len, self.prednet_len1, r=4)
self.prednet_full2_lora = lora.Linear(self.prednet_len1, self.prednet_len2, r=4)
self.prednet_full3_lora = lora.Linear(self.prednet_len2, 1, r=4)
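For context, these layers come from the loralib package; the surrounding setup would look roughly like the sketch below (the import alias and the freezing call are assumptions about the standard loralib workflow, not code shown above):

import loralib as lora

# Sketch: in the usual loralib workflow, everything except the
# lora_A / lora_B matrices is frozen after the model is built.
lora.mark_only_lora_as_trainable(net)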
The forward part of the model is shown below (assuming input_x is the input):
input_x = torch.sigmoid(self.prednet_full1_lora(input_x))
input_x = torch.sigmoid(self.prednet_full2_lora(input_x))
output = torch.sigmoid(self.prednet_full3_lora(input_x))
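Note that the sigmoid derivative is at most 0.25, so three stacked sigmoids already shrink upstream gradients substantially. One way to test whether that is the cause would be a diagnostic variant with ReLU on the hidden layers (a sketch for comparison only, not my actual model):

# Diagnostic sketch: swap the hidden sigmoids for ReLU, keep the
# final sigmoid, then compare the gradient magnitudes again.
input_x = torch.relu(self.prednet_full1_lora(input_x))
input_x = torch.relu(self.prednet_full2_lora(input_x))
output = torch.sigmoid(self.prednet_full3_lora(input_x))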
and I did not forget to call:
loss.backward()
optimizer.step()
net.apply_clipper()
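To tell exact zeros apart from merely tiny values, the per-parameter gradient norms can be printed right after loss.backward() with a loop like this (a sketch; the parameter names follow the loralib definitions above):

# Sketch: inspect the gradient magnitude of each LoRA matrix per layer.
for name, param in net.named_parameters():
    if 'lora_' in name:
        grad_norm = None if param.grad is None else param.grad.norm().item()
        print(name, grad_norm)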
I would greatly appreciate any ideas or solutions.