Feature: allow picking whatever pytorch lr_scheduler #79

Open · wants to merge 4 commits into base: master

Conversation

emanjavacas (Owner):

Here is a draft for the lr_scheduler params (related to #78) and how to avoid breaking backwards compatibility. Could you use it to re-test the cosine annealing results you got?

PonteIneptique (Contributor) left a comment:

Before I try this version: I have a few comments, and it is missing an implementation of the DelayerScheduler.

pie/trainer.py Outdated
self.lr_scheduler.num_bad_epochs,
self.lr_scheduler.patience,
self.lr_scheduler.threshold)
res = '<LrScheduler lr="{:g}"'.format(
PonteIneptique (Contributor):

From the code I added, we are missing the type of the LrScheduler here.

self.lr_scheduler.threshold)
res = '<LrScheduler lr="{:g}"'.format(
self.lr_scheduler.optimizer.param_groups[0]['lr'])
for key in dir(self.lr_scheduler):
PonteIneptique (Contributor):

We are losing the manual filter on which attributes we want to show here.

emanjavacas (Owner, Author):

Sure, but otherwise you have to do it manually per class...

PonteIneptique (Contributor):

I know... Couldn't we curate the output for specific schedulers, and show everything automatically for the others?

emanjavacas (Owner, Author):

As long as it doesn't introduce too much clutter, feel free!
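A rough sketch of how the repr could both name the scheduler class and keep a curated view for known schedulers, with an automatic fallback for the rest. The CURATED_ATTRS mapping and the repr_scheduler helper are illustrative, not the code in this PR:

CURATED_ATTRS = {
    # curated attributes for schedulers we know well
    'ReduceLROnPlateau': ('patience', 'num_bad_epochs', 'threshold', 'factor'),
    'CosineAnnealingLR': ('T_max', 'eta_min', 'last_epoch'),
}

def repr_scheduler(lr_scheduler):
    name = type(lr_scheduler).__name__
    lr = lr_scheduler.optimizer.param_groups[0]['lr']
    res = '<LrScheduler type="{}" lr="{:g}"'.format(name, lr)
    # curated attributes where available, automatic fallback otherwise
    keys = CURATED_ATTRS.get(
        name, [k for k in vars(lr_scheduler)
               if not k.startswith('_') and k != 'optimizer'])
    for key in keys:
        value = getattr(lr_scheduler, key, None)
        if isinstance(value, (int, float, str)):
            res += ' {}="{}"'.format(key, value)
    return res + '/>'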

"lr_patience": 2, // patience for lr schedule

"lr_scheduler": "ReduceLROnPlateau",
"lr_scheduler_params": {
PonteIneptique (Contributor):

I definitely like this part.
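For context, a minimal sketch of how a config like the one above could be mapped onto any torch.optim.lr_scheduler class by name; the build_scheduler helper is illustrative, not the code in this PR:

import torch
from torch.optim import lr_scheduler as schedulers

def build_scheduler(optimizer, settings):
    # look up the scheduler class by name and pass the params straight through
    name = settings.get("lr_scheduler", "ReduceLROnPlateau")
    params = settings.get("lr_scheduler_params", {})
    return getattr(schedulers, name)(optimizer, **params)

# illustrative usage
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = build_scheduler(optimizer, {
    "lr_scheduler": "ReduceLROnPlateau",
    "lr_scheduler_params": {"mode": "max", "patience": 2}})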

PonteIneptique (Contributor) commented Dec 7, 2020:

But, as I mentioned in my other PR, this basically breaks when instantiating something other than Adam.

Another option would be to have

{
  "lr_scheduler_params": {
    "ReduceLROnPlateau": {
      "mode": "max", ...
    }
  }
}

and this way provide defaults for the schedulers we have tested?
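A rough sketch of that nested-defaults idea, where per-scheduler defaults are merged with whatever the config supplies; the TESTED_DEFAULTS dict, the helper, and the values shown are placeholders, not the code in this PR:

TESTED_DEFAULTS = {
    # defaults for schedulers we have tested, keyed by class name (placeholder values)
    "ReduceLROnPlateau": {"mode": "max", "patience": 2, "factor": 0.6},
    "CosineAnnealingLR": {"T_max": 40},
}

def resolve_scheduler_params(name, lr_scheduler_params):
    # start from the tested defaults, let the config override them
    params = dict(TESTED_DEFAULTS.get(name, {}))
    params.update(lr_scheduler_params.get(name, {}))
    return params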

emanjavacas (Owner, Author):

But those parameters are not specific to the optimizer but to the scheduler, right? I have made some changes that should address this issue...

PonteIneptique (Contributor):

I am sorry, I meant scheduler; I am a bit tired right now ^^. I have corrected the comment above.

emanjavacas (Owner, Author):

From the last discussion, we concluded we would leave out the stuff coming from third-party libraries (including the delayed scheduler).

PonteIneptique (Contributor):

Delayer is not third party, I coded it :) And it's only with Delayer that Cosine significantly outperforms Adam :)

emanjavacas (Owner, Author):

Could you pinpoint what it is that the Delayer class achieves? I see a lot of new code involved in printing and so on. Perhaps there is an easier way to add what the Delayer class contributes.

PonteIneptique (Contributor):

There definitely is. May I PR your PR?

emanjavacas (Owner, Author):

Yes, go ahead. Only, I'd really rather keep it simple. For instance, if there is a class that delays the step, we don't need to add a bunch of new functions to adapt the printing code for that class.
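To make that concrete, a minimal sketch of a wrapper that simply holds off stepping the wrapped scheduler for a fixed number of epochs; this sketches the concept only, not the actual DelayerScheduler implementation from this PR:

import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

class DelayedScheduler:
    # hold the learning rate for `delay` epochs, then defer to the wrapped scheduler
    def __init__(self, scheduler, delay):
        self.scheduler = scheduler
        self.delay = delay
        self.epochs_seen = 0

    def step(self, *args, **kwargs):
        self.epochs_seen += 1
        if self.epochs_seen > self.delay:
            # the wrapped scheduler only starts counting epochs from here on
            self.scheduler.step(*args, **kwargs)

# illustrative usage: 10 epochs at the base lr, then cosine annealing
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = DelayedScheduler(CosineAnnealingLR(optimizer, T_max=40), delay=10)
for epoch in range(60):
    # ... train one epoch, calling optimizer.step() per batch ...
    scheduler.step()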

PonteIneptique and others added 2 commits December 7, 2020 17:15
* Fix code typo in trainer.py

* (Trainer) Implement Delay on LRScheduler
emanjavacas changed the title from "adapt" to "Feature: allow picking whatever pytorch lr_scheduler" on Dec 7, 2020
emanjavacas (Owner, Author):

Check the last commit; that might solve the issues.

PonteIneptique (Contributor) left a comment:

Tried it with ReduceLROnPlateau, Delay(ReduceLR...), and Delay(Cosine).
It seems to work fine; the delay is applied correctly (lr_scheduler.last_epoch is incremented only after the delay is reached).
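For reference, last_epoch on a bare scheduler advances on every step() call, which is exactly what a delay wrapper should suppress during the delay window; a quick standalone check (not pie code):

import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = CosineAnnealingLR(optimizer, T_max=40)
print(scheduler.last_epoch)  # 0 right after construction
scheduler.step()
print(scheduler.last_epoch)  # 1: without a delay, every step() increments it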

PonteIneptique (Contributor):

Great!

emanjavacas (Owner, Author):

emanjavacas commented Dec 7, 2020 via email

PonteIneptique (Contributor):

Running:

   "lr_scheduler": "CosineAnnealingLR", // LR Scheduler to use: ReduceLROnPlateau,
   "lr_scheduler_delay": 10,
   "lr_scheduler_params": {
      "T_max": 40,
      "min_lr": 0.000001, // minimum learning rate
      "lr_patience":7,
      "factor":0.6,
   },

PonteIneptique (Contributor):

You don't want me to rerun the 2x10 runs, right?

PonteIneptique (Contributor) commented Dec 7, 2020:

It works: I ran it once on my personal GPU for 67 epochs (higher parts of the box in #76 (comment)), with a score within the box: 97.61. (The previous version of this comment was using the dev score, because I am tired and can't read.)
