feat: allow to pass --ckpt.resume and start from scratch if no ckpt f… #26

Merged
merged 3 commits into from
Aug 23, 2024

Collaborator

@samsja samsja commented Aug 23, 2024

…iles are present.

This allows passing the same command both for starting from scratch and for auto-resuming, which makes elastic jobs easier to handle.
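A minimal sketch of the "resume if possible, otherwise start from scratch" decision described here. The `step_<n>` directory layout and the helper names `latest_checkpoint` and `resolve_start` are assumptions for illustration only; they are not taken from this PR's diff.

```python
import re
from pathlib import Path
from typing import Optional


def latest_checkpoint(ckpt_dir: Path) -> Optional[Path]:
    """Return the highest-numbered `step_<n>` entry, or None if none exists.

    Assumption: checkpoints are stored as `step_<n>` entries in `ckpt_dir`.
    """
    if not ckpt_dir.is_dir():
        return None
    candidates = []
    for entry in ckpt_dir.iterdir():
        match = re.fullmatch(r"step_(\d+)", entry.name)
        if match:
            candidates.append((int(match.group(1)), entry))
    if not candidates:
        return None
    # Pick the entry with the largest step number.
    return max(candidates, key=lambda pair: pair[0])[1]


def resolve_start(resume: bool, ckpt_dir: Path) -> Optional[Path]:
    """Return the checkpoint to load, or None to start from scratch.

    With `resume=True` and no checkpoint on disk, this falls back to a
    fresh start instead of raising, so one command covers both cases.
    """
    if not resume:
        return None
    return latest_checkpoint(ckpt_dir)
```

With this shape, an elastic job can always be launched with the resume flag set: the first launch finds nothing and trains from scratch, and later restarts pick up the most recent checkpoint.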

Member

@Jackmin801 Jackmin801 left a comment

Nice! Does the wandb auto resume work? I've never tried it before.

Collaborator Author

samsja commented Aug 23, 2024

> Nice! Does the wandb auto resume work? I've never tried it before.

Yes, it works! It even works too well: if I start a different run (e.g. with a changed hyperparameter), it reuses the old run. I just changed the behavior so that wandb auto resume is only triggered when we are restarting from a ckpt.

It might still be a problem if I restart from a ckpt that belongs to a different run than the previous one. But I think we can live with that for now; I'm just being careful.
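The gating described above could look roughly like the sketch below. The helper `wandb_init_kwargs` and its parameters are hypothetical and not taken from the PR; the only wandb behavior assumed is that `wandb.init` accepts `project` and `resume="auto"`.

```python
from typing import Any, Dict


def wandb_init_kwargs(project: str, resuming_from_ckpt: bool) -> Dict[str, Any]:
    """Build keyword arguments for `wandb.init` (hypothetical helper).

    Auto resume is requested only when training restarts from a checkpoint,
    so a fresh launch with changed hyperparameters gets a new run instead
    of silently reusing the old one.
    """
    kwargs: Dict[str, Any] = {"project": project}
    if resuming_from_ckpt:
        kwargs["resume"] = "auto"
    return kwargs
```

A caller would then do something like `wandb.init(**wandb_init_kwargs("my-project", ckpt_path is not None))`, where `ckpt_path` is whatever the checkpoint lookup returned. This does not address the remaining caveat of resuming from a checkpoint that belongs to a different run.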

@samsja samsja merged commit ad3a344 into main Aug 23, 2024
2 checks passed