-
-
Notifications
You must be signed in to change notification settings - Fork 883
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ds model card, rebased #2101
base: main
Are you sure you want to change the base?
Conversation
src/axolotl/core/trainer_builder.py
Outdated
@@ -410,10 +426,12 @@ def __init__( | |||
*_args, | |||
bench_data_collator=None, | |||
eval_data_collator=None, | |||
dataset_tag_names=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dataset_tag_names
isn't used for the base Trainer is it? Only for DPOTrainer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is used for both AxolotlTrainer and AxolotlDPOTrainer within def push_to_hub
via _sanitize_kwargs_for_ds_tagging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not fond of introducing arguments that don't match the original Trainer signature for this case. The call to _sanitize_kwargs_for_ds_tagging
should handle the case where the attribute isn't available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the advice. While trl trainer has an override for def create_model_card
over the base transformers trainer, it uses the same def push_to_hub
inherited from the base transformers trainer. I changed dataset_tag_names
to dataset_tags
to match def push_to_hub
Description
Continuation of PR#1015,#2094 with rebasing
Motivation and Context
How has this been tested?
Screenshots (if appropriate)
Types of changes
Social Handles (Optional)