
update train_sts_seed_optimization with SentenceTransformerTrainer #3092

Merged
4 commits merged on Dec 2, 2024

Conversation

JINO-ROHIT
Contributor

This PR updates the example script train_sts_seed_optimization.py with SentenceTransformerTrainer.

I also noticed the documentation was quite outdated when I was referring to it for some args; should we look into updating that too?

@tomaarsen

Comment on lines 96 to 97
  # Configure the training. We skip evaluation in this example
- warmup_steps = math.ceil(len(train_dataloader) * num_epochs * 0.1)  # 10% of train data for warm-up
+ warmup_steps = math.ceil(len(train_dataset) * num_epochs * 0.1)  # 10% of train data for warm-up
Collaborator

The SentenceTransformerTrainingArguments has a warmup_ratio=0.1 that we can use instead.
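For reference, a minimal sketch of what that could look like (illustrative only; the output_dir is just a placeholder):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output/sts_seed_optimization",  # placeholder path
    num_train_epochs=1,
    # 10% of the total training steps are used for warm-up,
    # so there is no need to compute warmup_steps by hand.
    warmup_ratio=0.1,
)
```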


# Stopping and Evaluating after 30% of training data (less than 1 epoch)
# We find from (Dodge et al.) that 20-30% is often ideal for convergence of random seed
- steps_per_epoch = math.ceil(len(train_dataloader) * stop_after)
+ steps_per_epoch = math.ceil(len(train_dataset) * stop_after)
Collaborator

I don't think this is used right now.

- steps_per_epoch=steps_per_epoch,
- evaluation_steps=1000,
+ # 5. Define the training arguments
+ args = SentenceTransformerTrainingArguments(
Collaborator

I think stop_after isn't actually making training stop after that many steps.
Normally you could use max_steps, but I think that interferes with the scheduler: ideally we want the scheduler to behave "normally" but still stop training after stop_after steps. I'm not sure whether that was the old behaviour either.

@tomaarsen
Collaborator

I'm also curious what you mean by the outdated docs - I'd like that to be fixed as well if possible.

@JINO-ROHIT
Contributor Author

Ahh yeah, I'll change those.

Sorry about the docs, I was accidentally referring to the old fit method here - https://www.sbert.net/docs/package_reference/sentence_transformer/SentenceTransformer.html#:~:text=steps_per_epoch%20%E2%80%93%20Number%20of%20training%20steps,the%20DataLoader%20size%20from%20train_objectives.&text=warmup_steps%20%E2%80%93%20Behavior%20depends%20on%20the,to%20the%20maximal%20learning%20rate. - and saw args like steps_per_epoch and warmup_steps that aren't there in the Trainer.

@JINO-ROHIT
Contributor Author

Also, I don't quite understand the stop_after bit; is a custom callback expected?

@tomaarsen
Collaborator

> Also, I don't quite understand the stop_after bit; is a custom callback expected?

Makes sense, this is a little confusing. I think the idea is that we create 1 epoch of e.g. 100k steps. The seed (e.g. for data sampling) has been shown to be fairly important for training embedding models, so we want to train e.g. the first 30k steps out of the 100k and then see where we're at. Then we can pick the seed that performed the best after just a bit of training.

But if we use max_steps=0.3 * total_steps, then our scheduler will also recognize that we're only doing 30k steps, and update accordingly.

Instead, we want the scheduler to think that we're doing 100k steps, but indeed we want the training to stop after 30k (or stop_after * total_steps). I think a custom callback is indeed a good solution, I'll whip something up and add it to this PR.
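For reference, one possible shape for such a callback (a rough sketch, not necessarily what ended up in the PR; the StopAfterCallback name and the 0.3 default are illustrative):

```python
from transformers import TrainerCallback


class StopAfterCallback(TrainerCallback):
    """Stops training after a fraction of the total steps, while the
    scheduler remains configured for the full number of steps."""

    def __init__(self, stop_after: float = 0.3):
        self.stop_after = stop_after

    def on_step_end(self, args, state, control, **kwargs):
        # state.max_steps reflects the full schedule, so warm-up and decay
        # behave as if training were going to run to completion.
        if state.global_step >= self.stop_after * state.max_steps:
            control.should_training_stop = True
        return control
```

The callback could then be passed to the trainer via its callbacks argument, e.g. SentenceTransformerTrainer(..., callbacks=[StopAfterCallback(0.3)]).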


@tomaarsen
Collaborator

This was my final log with this script at the default parameters:

Current sts-dev_spearman_cosine Scores per Seed: {7: 0.8510304677377857,
 9: 0.8487004831766769,
 5: 0.8486879350254634,
 1: 0.8446502257325709,
 2: 0.8422240179194527,
 4: 0.8402418940176778,
 6: 0.8381279059862979,
 0: 0.8367594493128546,
 3: 0.831016545559169,
 8: 0.8271094833035064}

So there's indeed a pretty big difference: 0.827 vs. 0.851.


@JINO-ROHIT
Contributor Author

Ah, thanks for explaining, this makes sense.
Whoops, there still seems to be a difference with your custom callback? Is that possibly because of the randomness of the seed itself?

@tomaarsen
Collaborator

A difference? Between the evaluation scores, you mean?
Indeed, those are because of the different seeds (which impact a lot of the training process, from data sampling to dropout, and also weight initialization if you're training fully from scratch).


@JINO-ROHIT
Contributor Author

ahh okay okay

@tomaarsen merged commit 39b6eae into UKPLab:master on Dec 2, 2024
9 checks passed
@tomaarsen
Collaborator

Thanks for tackling this!

