Add adjust_lr function for learning rate schedules #247
Summary
- Add an `adjust_lr` function to `iterate_dataset.py` to support learning rate schedules with warmup and cooldown phases
- Extend `DatasetBatch` to include a `total_steps` field needed for LR calculations

Implementation Details
The `adjust_lr` function supports (see the sketch after this list):
- `warmup_length`: linear ramp from 0 to the base LR (can be an int for a step count or a float for a ratio of total steps)
- `cooldown_length`: linear decay from the base LR to 0 (can be an int for a step count or a float for a ratio of total steps)
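A minimal sketch of the schedule described above, assuming `adjust_lr` takes the base LR, the current step, and the `total_steps` value carried on `DatasetBatch`; the exact signature and argument names in `iterate_dataset.py` may differ.

```python
def _resolve_length(length, total_steps):
    """Interpret an int as a step count and a float as a ratio of total_steps."""
    if isinstance(length, float):
        return int(length * total_steps)
    return length


def adjust_lr(base_lr, step, total_steps, warmup_length=0, cooldown_length=0):
    """Linear warmup to base_lr, constant plateau, then linear cooldown to 0."""
    warmup_steps = _resolve_length(warmup_length, total_steps)
    cooldown_steps = _resolve_length(cooldown_length, total_steps)

    if warmup_steps > 0 and step < warmup_steps:
        # Warmup phase: ramp from 0 up to base_lr.
        return base_lr * (step + 1) / warmup_steps

    if cooldown_steps > 0 and step >= total_steps - cooldown_steps:
        # Cooldown phase: decay from base_lr down to 0.
        return base_lr * (total_steps - step) / cooldown_steps

    # Plateau: constant base LR between warmup and cooldown.
    return base_lr
```

For example, `adjust_lr(3e-4, step, batch.total_steps, warmup_length=0.05, cooldown_length=1000)` would warm up over the first 5% of training and cool down over the last 1000 steps; the caller is assumed to write the returned value into the optimizer's parameter groups.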
Status
This is a DRAFT PR - we'll make it final once we've had a chance to test it on some real runs.
Context
We've had good success with a constant learning rate in our experiments, but there may be some benefit to warmup and cooldown phases that we need to investigate through empirical testing.
🤖 Generated with Claude Code