
loss #29

Open
yanzichuan opened this issue Nov 18, 2024 · 5 comments

Comments

@yanzichuan

Hello,

I hope you're doing well. I would appreciate your insights regarding some loss values I observed during training. Specifically, I encountered the following values for v_num=12:

val_loss = -1.61
val_d_loss = -1.72
val_d_loss0 = -1.72
val_g_loss = 0.109
val_rec_loss = 0.116
val_adv_loss = -0.00658
I am curious whether these values seem reasonable, especially given the negative values for val_loss and val_d_loss. I would appreciate any guidance on whether this behavior is expected or whether it may indicate an issue with my setup or training process.

Thank you in advance for your help!

Best regards,
Zack

@yanzichuan
Author

Thank you for the amazing work your team has done and for sharing your contributions in Facial Expression Recognition (FER). I was wondering if it would be possible for you to provide the training and inference code for LP (Linear Probing) or FT (Fine-Tuning) used in your FER experiments. It would be incredibly helpful for furthering my understanding and experiments.

@ControlNet
Owner

Hi Zack,

I am curious if these values seem reasonable, especially given the negative values for val_loss and val_d_loss. I would appreciate any guidance on whether this behavior is expected or if it may indicate an issue with my setup or training process.

The d_loss is the WGAN critic loss, so a negative value is expected and not a problem.

$L_{adv}^{(d)} = \text{discriminator}(\text{fake samples}) - \text{discriminator}(\text{real samples})$
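To make the sign behavior concrete, here is a minimal sketch of that critic loss in plain Python (the helper name and the score values are illustrative, not from the MARLIN codebase). When the critic assigns higher scores to real samples than to fakes, the loss is negative:

```python
def wgan_critic_loss(real_scores, fake_scores):
    """WGAN critic loss: mean critic score on fakes minus mean score on reals."""
    mean_fake = sum(fake_scores) / len(fake_scores)
    mean_real = sum(real_scores) / len(real_scores)
    return mean_fake - mean_real

# Critic scores reals higher than fakes, so the loss goes negative:
loss = wgan_critic_loss(real_scores=[1.5, 2.0], fake_scores=[0.1, -0.3])
print(loss)  # -1.85
```

This is why values like val_d_loss = -1.72 are not alarming on their own.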

Code for FER.

The code for FER is very similar to the code for attribute classification, as it just adds a linear layer on top of the MARLIN encoder. You can easily adapt the attribute classification code to FER.
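As a rough illustration of that structure, here is a minimal linear-probing sketch in PyTorch. The class name, the encoder feature dimension (768), and the number of expression classes (7) are assumptions for illustration; check the actual MARLIN encoder output size, and keep the encoder frozen for LP (or unfrozen for FT):

```python
import torch
import torch.nn as nn

class FERLinearProbe(nn.Module):
    """Hypothetical linear probe: pooled (frozen) encoder features -> class logits.

    encoder_dim and num_classes are illustrative defaults, not MARLIN's values.
    """
    def __init__(self, encoder_dim: int = 768, num_classes: int = 7):
        super().__init__()
        self.head = nn.Linear(encoder_dim, num_classes)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, encoder_dim) pooled embeddings from the encoder
        return self.head(features)

probe = FERLinearProbe()
logits = probe(torch.randn(4, 768))
print(logits.shape)  # torch.Size([4, 7])
```

For LP, only `probe.head` receives gradients; for FT, the encoder parameters are trained as well.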

@yanzichuan
Author

Regarding Negative Loss Values
Thank you for your explanation. I would like to clarify my observations when running the MARLIN training code. After training for 2000 epochs, the checkpoint saved is either the one with the minimum loss or from the final training epoch. I noticed that the minimum loss recorded was -1.777 at epoch 5, while the loss at the final epoch was -0.0013. Could you please confirm if, for MARLIN, having a smaller (more negative) loss is better, or if a loss closer to zero is more desirable? Otherwise, it seems that training for 2000 epochs might not yield much benefit. I would appreciate your clarification. Thank you.

Attribute Classification Code and CMU-MOSEI Dataset
I could not find the attribute classification code on your GitHub repository. Additionally, regarding the CMU-MOSEI dataset, did you use mosei_senti_data.pkl or mosei.hdf5? Currently, I can only find four versions: mosei_raw.pkl, mosei_senti_data_pkl, mosei_unalign.hdf5, and mosei.hdf5. Could you please guide me on which dataset to use and how to utilize it? If possible, I would greatly appreciate it if you could provide your code to facilitate my learning. Thank you again for your support.

@yanzichuan
Author

Thank you sincerely for your timely response and for addressing my questions. I truly appreciate your support and valuable insights.

@yanzichuan
Author

yanzichuan commented Nov 18, 2024

**Should the checkpoint-saving loss be set this way?**

```python
from pytorch_lightning.callbacks import ModelCheckpoint

checkpoint_callback = ModelCheckpoint(
    dirpath="checkpoints",
    filename="best_generator_model",
    monitor="val_g_loss",
    mode="min",
    save_top_k=1,
    verbose=True,
)
```

Could you please advise?
