Thanks for sharing your method and code! I'm curious if there is any prior work on using CORN in conjunction with typical cross-entropy loss in a multi-task learning setting. To be concrete, I'm working on a model that includes (N < 10) classification heads on top of a transformer backbone. Some of these classification heads learn probability distributions for fundamentally ranked classes (e.g. a spice_level of 2, 3, 4, 5), while some of these classification heads learn probability distributions for non-ranked classes (e.g. spice_selected of paprika, thyme, cumin). I train this network by summing the losses for each of the classification heads and backpropping through the entire network. As such, I'm curious if there is a way to map a CE loss to a CORN loss (or vice versa). Thank you again!
I haven't done experiments for this, but I guess you could also include the CORN loss in the sum. I.e., OverallLoss = CE1 + CE2 + ... + CE9 + $\alpha$ CornLoss
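For illustration, here is a minimal PyTorch sketch of such a combined loss, assuming a shared feature vector and two hypothetical heads (a nominal spice_selected head trained with cross-entropy, and an ordinal spice_level head trained with CORN). The `corn_loss` below is a simplified re-implementation written only to keep the example self-contained; in practice you would use `coral_pytorch.losses.corn_loss`. All names, shapes, and the `alpha` weight are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def corn_loss(logits, y, num_classes):
    """Simplified CORN loss (conditional ordinal regression).

    logits: (batch, num_classes - 1) -- one logit per binary rank task
    y:      (batch,) integer ranks in 0 .. num_classes - 1
    Task k is trained only on examples with y >= k and predicts P(y > k | y >= k).
    """
    total = logits.new_zeros(())
    count = 0
    for k in range(num_classes - 1):
        mask = y >= k                              # conditional training subset
        if mask.sum() == 0:
            continue
        target = (y[mask] > k).float()             # binary "rank exceeds k" label
        total = total + F.binary_cross_entropy_with_logits(
            logits[mask, k], target, reduction="sum")
        count += int(mask.sum())
    return total / count

class MultiTaskHead(nn.Module):
    """Hypothetical two-head model on top of precomputed backbone features."""
    def __init__(self, feat_dim, n_spices, n_levels):
        super().__init__()
        self.spice_head = nn.Linear(feat_dim, n_spices)      # nominal -> CE
        self.level_head = nn.Linear(feat_dim, n_levels - 1)  # ordinal -> CORN (K-1 logits)

    def forward(self, feats):
        return self.spice_head(feats), self.level_head(feats)

# Toy batch standing in for transformer backbone outputs
torch.manual_seed(0)
feats = torch.randn(8, 16)
spice_y = torch.randint(0, 3, (8,))   # e.g. paprika / thyme / cumin
level_y = torch.randint(0, 4, (8,))   # ranks 0..3 standing in for spice_level 2..5

model = MultiTaskHead(16, n_spices=3, n_levels=4)
spice_logits, level_logits = model(feats)

alpha = 1.0  # weight balancing the CE and CORN terms, tuned per task
loss = F.cross_entropy(spice_logits, spice_y) \
       + alpha * corn_loss(level_logits, level_y, num_classes=4)
loss.backward()
```

Since the summed loss backpropagates through all heads into the shared backbone, `alpha` can be tuned if the CORN term's scale differs noticeably from the CE terms.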