loss does not converge #125

DLC-jjj · 2022-11-05T14:43:21Z

Hi, I have a problem with the training process of the PyTorch version. I made no changes to the project, used the original BIPEDv2 dataset for training, and kept the project's default parameters. After 17 epochs of training, the loss barely changes. The model can still predict edge maps, but the results are not very good. What could be the reason? Looking forward to your reply.

xavysp · 2022-11-06T17:14:19Z

Hi, can I see the TensorBoard graph?

LvGuangzu · 2022-11-07T08:57:21Z

Have you solved this problem? I have the same problem and am looking forward to your reply. If it is not solved yet, we can discuss it.

DLC-jjj · 2022-11-07T13:08:43Z

I'm very sorry, I haven't solved it yet. I asked one of my classmates to try this project, and they ran into the same problem, which is also still unsolved.

DLC-jjj · 2022-11-07T13:16:20Z

Sorry, I did not have TensorBoard installed. The loss fluctuates between 1.5 and 3.5 from the first epoch to the last, so there is nothing wrong with the loss curve. I asked one of my classmates to try this project and they encountered the same problem, as did the other commenter in this thread. I'm a little troubled.

xavysp · 2022-11-08T06:18:28Z

Well, if you used DexiNed without changing the TensorBoard part, you probably already have the data; you just need to open the graph. We need to see whether there is an improvement across epochs. The labels in edge detection are very sensitive: in some training samples the model may detect more edges than the GT contains, so you will see different loss values, but that does not mean DexiNed is not training. You should check the average loss of each epoch. This happens in DL-based edge detectors.
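For anyone without TensorBoard, the per-epoch average mentioned above can be computed from plain printed or saved batch losses. This is only a minimal sketch: the helper name `average_loss_per_epoch` and the sample values are hypothetical and not part of the DexiNed code.

```python
from collections import defaultdict


def average_loss_per_epoch(records):
    """Return {epoch: mean batch loss} from (epoch, batch_loss) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for epoch, loss in records:
        totals[epoch] += loss
        counts[epoch] += 1
    return {epoch: totals[epoch] / counts[epoch] for epoch in sorted(totals)}


# Hypothetical per-batch losses: noisy within an epoch, lower on average in epoch 1.
log = [(0, 3.5), (0, 1.5), (0, 2.5), (1, 3.0), (1, 1.0), (1, 2.0)]
epoch_means = average_loss_per_epoch(log)
for epoch, mean in epoch_means.items():
    print(f"epoch {epoch}: mean loss {mean:.3f}")
```

Individual batch losses can swing widely while the epoch means still move, so the means are the numbers worth comparing between epochs.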

xavysp · 2022-11-08T06:20:42Z

How many training samples do you have?

LvGuangzu · 2022-11-08T06:24:22Z

I trained on the BIPED dataset from the project; the training set contains 200 images. My loss dropped to 0.9 in the 34th epoch but rose to 2.1 in the 35th, and overall the loss stayed around 2.

LvGuangzu · 2022-11-08T06:26:12Z

Then I trained again, using the .pth file with loss 0.9 as my pretrained model, and found that the loss started to oscillate around 1.7.

LvGuangzu · 2022-11-08T06:31:00Z

I read that the experiment in your paper ran for 150K iterations. I don't know whether my problem is that I trained for too few. I trained for 100 epochs and saw no improvement.

DLC-jjj · 2022-11-08T08:32:31Z

Thank you for your reply. I am using the BIPEDv2 data, with 200 training images. My training loss behaves like the one described in the previous reply and shows no tendency to converge.

xavysp · 2022-11-08T12:12:01Z

Did you change any hyperparameters? Could you check with my lightweight model?
LDC: Lightweight Dense CNN for Edge Detection
I cannot find any error; this is just to make sure that the problem is not the data.

LvGuangzu · 2022-11-08T13:08:01Z

Yes, I changed some hyperparameters. I did not change any on the first run, but the loss did not converge. I suspected that the learning rate was dropping too quickly, so I then modified the hyperparameters: first I set is_testing=False, then I changed the learning rate to drop 10x every 20 epochs.

LvGuangzu · 2022-11-08T13:10:09Z

Ok, I'll try to reproduce the LDC code over the next two days, and if it's not difficult to reproduce, I'll get back to you soon.

LvGuangzu · 2022-11-08T15:37:32Z

Hello, I have reproduced the LDC model and modified some of the hyperparameters. All of my changes are listed below.

is_testing=True -> is_testing=False
epochs = 25 -> epochs = 50
adjust_lr = [6,12,18] -> adjust_lr = [12,24,36]
The BIPED dataset is still used for training, with resume=False, so training starts from scratch.
The final result shows signs of convergence. Below is the loss curve I got after training for 50 epochs.
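To make the effect of the adjust_lr change concrete, here is a rough sketch of a milestone schedule. It assumes that each entry in adjust_lr triggers one 10x drop, which I have not verified against the LDC code; `lr_at_epoch` and the base learning rate are hypothetical.

```python
def lr_at_epoch(epoch, base_lr, milestones, gamma=0.1):
    """Learning rate after one decay step per milestone already reached."""
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * (gamma ** passed)


BASE_LR = 1e-3  # assumed value, for illustration only

for name, milestones in (("default", [6, 12, 18]), ("modified", [12, 24, 36])):
    rates = [lr_at_epoch(e, BASE_LR, milestones) for e in (0, 10, 20, 40)]
    print(name, ["%.0e" % r for r in rates])
```

Under that assumption, the default milestones have already applied all three drops by epoch 20, while the modified ones have applied only one, so the model keeps a larger learning rate for longer.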

LvGuangzu · 2022-11-08T15:39:11Z

There is one thing I don't understand: as shown in the figure below, what is the role of the seed here? Will it affect the results if I delete it? Maybe I overlooked the relevant documentation. I hope you can help me with this.

xavysp · 2022-11-08T23:24:56Z

Can I see the results? For example, the edge map of the Lenna image.

xavysp · 2022-11-08T23:26:49Z

The seed does not matter much at large scale; I was just trying to generalize the edge detection by changing the seed.
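As general background on what a seed does, and not a description of the DexiNed or LDC code: fixing a seed makes a pseudo-random sequence repeatable, for example the order in which samples are shuffled. A minimal sketch with a hypothetical helper `shuffled_order`:

```python
import random


def shuffled_order(n_samples, seed=None):
    """Return a shuffled list of sample indices; a fixed seed makes it repeatable."""
    rng = random.Random(seed)  # private generator, does not touch global state
    order = list(range(n_samples))
    rng.shuffle(order)
    return order


run_a = shuffled_order(10, seed=7)
run_b = shuffled_order(10, seed=7)
print(run_a == run_b)  # same seed, same order
```

Removing the seed only means that two runs will not shuffle identically; it does not change what the model can learn.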

LvGuangzu · 2022-11-09T01:44:31Z

Of course. I used the .pth file from the 26th epoch, because its loss of 3.56 was the lowest within the 50 epochs. The first image is the avg output, the second is the fuse output.

LvGuangzu · 2022-11-09T01:47:49Z

So can I take it that the seed here is basically useless? Can I delete it for later training and testing? Or when would I need to use a seed?

xavysp · 2022-11-10T00:57:37Z

Yes, it is useless.

xavysp · 2022-11-10T01:05:12Z

Here is Lenna from the fused module, then the averaged one. Results from LDC.

LvGuangzu · 2022-11-12T11:48:24Z

The results look very good. May I ask what loss the LDC model finally converged to when you trained it on the BIPED dataset? I used the catloss provided in the code.

xavysp · 2022-11-14T22:23:59Z

Sorry, I don't have access to my former lab, so I cannot retrieve it. I will let you know whenever I have it.

zwz-append · 2022-12-04T09:29:16Z

Hello, I have a question about dataset.py when I use the BIPEDv2 dataset for training. I changed 'data_dir' in main.py, and changed data_types= ['aug'] -> data_types= ['real'] on lines 322 and 330 of dataset.py. However, the system shows NotADirectoryError: [WinError 267] The directory name is invalid: 'E:\SZUer\codes\CVweizhu\DexiNed-master\dataset-lists\BIPEDv2\BIPED\edges\imgs\train\rgbr\real\RGB_001.jpg'.
I have tried testing lines 368-374 of dataset.py, but without success.
However, I can use the BIPEDv2 dataset for testing. I don't know how to make training work.
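A note on the error itself: NotADirectoryError is what Python raises when a call that expects a folder, such as os.listdir, is given a file path. One guess, not checked against dataset.py, is that the loader expects one more folder level under `real` than exists on disk, so it tries to list RGB_001.jpg as if it were a folder. A small diagnostic sketch (the helper `first_non_directory` is hypothetical) that reports which part of a path is a file:

```python
import os
import tempfile


def first_non_directory(path):
    """Return the first prefix of `path` that exists but is not a directory, else None."""
    prefixes = []
    current = os.path.abspath(path)
    while True:
        prefixes.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    for prefix in reversed(prefixes):  # check from the root downwards
        if os.path.exists(prefix) and not os.path.isdir(prefix):
            return prefix
    return None


# Demonstration on a temporary folder that mimics .../real/RGB_001.jpg
with tempfile.TemporaryDirectory() as root:
    real_dir = os.path.join(root, "real")
    os.makedirs(real_dir)
    image = os.path.join(real_dir, "RGB_001.jpg")
    open(image, "w").close()
    print(first_non_directory(real_dir))        # None: a real folder
    print(first_non_directory(image) == image)  # True: the image is a file
```

If the helper returns the image path for your dataset folder, the folder layout under `real` probably differs from the one the training loader walks for `aug`.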

LvGuangzu · 2022-12-12T12:25:35Z

OK, thank you. I applied the same data augmentation you used, so the current data volume is 288 * 200 images. Our laboratory only has one 3090 GPU, and training the DexiNed model takes an hour and a half per epoch. So I want to ask: after how many epochs did your training converge? I saw in the DexiNed paper that you ran 150k iterations in your experiments. Does that mean 150 rounds or 150,000 rounds?
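While waiting for an answer, the two readings can be compared with simple arithmetic. This sketch assumes that "150k" counts optimizer iterations (batches) and that the batch size is 8; both are assumptions and should be replaced with the values from your own configuration.

```python
import math


def iterations_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)


def epochs_for_iterations(total_iterations, num_images, batch_size):
    """How many passes over the data a given iteration count corresponds to."""
    return total_iterations / iterations_per_epoch(num_images, batch_size)


NUM_IMAGES = 288 * 200  # augmented set size from the comment above
BATCH_SIZE = 8          # assumption, check your own training arguments
TOTAL_ITERS = 150_000   # "150k" read as iterations

print(iterations_per_epoch(NUM_IMAGES, BATCH_SIZE))
print(round(epochs_for_iterations(TOTAL_ITERS, NUM_IMAGES, BATCH_SIZE), 2))
```

Under these assumptions one epoch is 7,200 iterations, so 150k iterations is roughly 21 epochs, or about 31 hours at an hour and a half per epoch.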
