
Are you sure .train() puts batch norm in eval mode? I'd assume that means train... #23

Open
brando90 opened this issue Sep 10, 2024 · 2 comments

Comments

@brando90

        # Fix batch norm running statistics (i.e., put batch_norm layers in eval mode)
        self.model.train()

Is this truly correct?

.train() usually puts layers in training mode. For batch norm that means it keeps updating the running statistics while normalizing with the mini-batch statistics, if I remember correctly, whereas in .eval() it normalizes with the saved running stats. Right?
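A minimal standalone sketch of that behavior (my own illustration, not code from this repo):

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    bn = nn.BatchNorm1d(4)          # running_mean starts at 0, running_var at 1
    x = torch.randn(8, 4) * 3 + 5   # mini-batch with mean ~5, std ~3

    bn.train()
    _ = bn(x)                       # normalizes with the mini-batch mean/var...
    print(bn.running_mean)          # ...and the running stats were updated toward 5

    bn.eval()
    _ = bn(x)                       # now normalizes with the saved running stats
    print(bn.running_mean)          # unchanged: eval mode does not update them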

@brando90 (Author)

    # Since we are fine-tuning the model during T2V/FIM computation, .train() is the right choice in general.
    self.model.train()

@brando90 (Author) commented Sep 10, 2024

Updated comment for the batch norm line:

    # Since we are fine-tuning the model during T2V/FIM computation, .train() is the
    # right choice, as it ensures batch norm uses mini-batch statistics and properly
    # adapts the model to the new task.
    self.model.train()

But LLMs don't really use batch norm, so in practice it doesn't matter here...
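For reference, if one actually wanted what the original comment says (freeze the batch norm running statistics while fine-tuning everything else), the usual pattern is to call .train() on the whole model and then flip just the batch norm layers back to eval mode. A sketch, assuming a standard PyTorch nn.Module (the helper name is my own):

    import torch.nn as nn

    def train_with_frozen_bn(model: nn.Module) -> None:
        """Training mode for everything except batch norm, so the
        running statistics stay fixed at their pretrained values."""
        model.train()
        for m in model.modules():
            if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
                m.eval()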
