
Learning rate of bias when training on multi GPUs #19

Open
algorithmsummer opened this issue Nov 23, 2020 · 0 comments

@algorithmsummer

Hi, should the learning rate of the bias parameters also be multiplied by the number of GPUs in the following code?

import torch


def make_optimizer(cfg, model, num_gpus=1):
    params = []
    for key, value in model.named_parameters():
        if not value.requires_grad:
            continue
        # Linear scaling rule: scale the base lr by the number of GPUs.
        lr = cfg.SOLVER.BASE_LR * num_gpus
        weight_decay = cfg.SOLVER.WEIGHT_DECAY
        if "bias" in key:
            # The bias lr is set from BASE_LR directly, without the
            # num_gpus factor applied above.
            lr = cfg.SOLVER.BASE_LR * cfg.SOLVER.BIAS_LR_FACTOR
            weight_decay = cfg.SOLVER.WEIGHT_DECAY_BIAS
        # Each param group carries its own lr, so no default lr needs
        # to be passed to the optimizer below.
        params += [{"params": [value], "lr": lr, "weight_decay": weight_decay}]
    if cfg.SOLVER.OPTIMIZER_NAME == 'SGD':
        optimizer = getattr(torch.optim, cfg.SOLVER.OPTIMIZER_NAME)(params, momentum=cfg.SOLVER.MOMENTUM)
    else:
        optimizer = getattr(torch.optim, cfg.SOLVER.OPTIMIZER_NAME)(params)
    return optimizer
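
That is, should the bias branch look something like the sketch below? This is only the change I am asking about, not the repo's actual code; the cfg fields are the same ones used in the snippet above.

# Hypothetical change: apply the linear scaling rule to the bias lr
# as well, so biases are scaled by num_gpus like the other parameters.
if "bias" in key:
    lr = cfg.SOLVER.BASE_LR * num_gpus * cfg.SOLVER.BIAS_LR_FACTOR
    weight_decay = cfg.SOLVER.WEIGHT_DECAY_BIAS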