LinearWarmupScheduler
This LR scheduler warms up the learning rate at the beginning of training, then decays it once the specified learning rate has been reached.
class LinearWarmupScheduler:
    def __init__(self, optimizer, num_warmup_steps: int, num_training_steps: int):
        ...
    ...
Construction parameters:
optimizer: optimizer instance
num_warmup_steps: Number of warmup steps
num_training_steps: Total training steps
These parameters can be set through the model’s set_lr_scheduler:
model.set_lr_scheduler(LinearWarmupScheduler, num_warmup_steps=10, num_training_steps=100)
The optimizer parameter does not need to be passed in; the model module will automatically add it internally.
Megatron models do not support this Scheduler.
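To make the roles of num_warmup_steps and num_training_steps concrete, here is a minimal standalone sketch of a warmup-then-decay schedule. It is not the library's implementation: the helper name linear_warmup_factor is hypothetical, and the linear decay to zero after warmup is an assumption, since the text above does not specify the decay shape.

```python
def linear_warmup_factor(step: int, num_warmup_steps: int, num_training_steps: int) -> float:
    """Return the multiplier applied to the base learning rate at `step`.

    Hypothetical helper for illustration only; the decay shape is assumed.
    """
    if step < num_warmup_steps:
        # Warmup phase: ramp linearly from 0 up to 1.
        return step / max(1, num_warmup_steps)
    # Decay phase (assumed linear): fall from 1 at the end of warmup
    # to 0 at the final training step, never going below 0.
    remaining = num_training_steps - step
    return max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))


if __name__ == "__main__":
    base_lr = 1e-3  # illustrative base learning rate
    for step in (0, 5, 10, 55, 100):
        factor = linear_warmup_factor(step, num_warmup_steps=10, num_training_steps=100)
        print(step, base_lr * factor)
```

With num_warmup_steps=10 and num_training_steps=100, the multiplier under these assumptions rises from 0 to 1 over the first 10 steps and then falls back to 0 by step 100.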