tisu19021997/cyclical-scheduler

Cyclical Learning Rate (CLR) and the 1-Cycle Policy as Keras callbacks.

An online Google Colab notebook is available here.

Me: https://github.com/tisu19021997

Short Description:

  • In short, the Cyclical LR method lets the learning rate of a model vary between two boundary values during training. In doing so, it provides substantial performance improvements across different architectures. Cyclical LR divides the training phase into cycles, and each cycle consists of 2 steps: one in which the learning rate climbs from the lower boundary to the upper one, and one in which it descends back (see the first sketch after this list).
  • The 1-Cycle policy uses the cyclical LR method but with only 1 cycle for the whole training run. Moreover, this policy suggests to "always use one cycle that is smaller than the total number of iterations/epochs and allow the learning rate to decrease several orders of magnitude less than the initial learning rate for the remaining iterations".
  • There are 2 variations of the 1-Cycle policy that I found while doing my research:
    • In the first variation, the learning rate varies in 3 states:
      1. from base_lr to max_lr
      2. from max_lr to base_lr
      3. from base_lr to min_lr (where min_lr=base_lr/some_factor)
    • In the second variation (the one used here), the learning rate varies in 2 states (see the second sketch after this list):
      1. from base_lr to max_lr
      2. from max_lr to min_lr
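
Below is a minimal sketch of the triangular CLR schedule as a Keras callback, assuming TF2's tf.keras API; the class name CyclicalLR, the hyperparameter defaults, and the per-batch update are illustrative choices, not this repository's actual interface.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import backend as K


class CyclicalLR(keras.callbacks.Callback):
    """Triangular CLR sketch: each cycle is 2 steps of `step_size`
    iterations, one climbing base_lr -> max_lr, one descending back."""

    def __init__(self, base_lr=1e-4, max_lr=1e-2, step_size=2000):
        super().__init__()
        self.base_lr = base_lr
        self.max_lr = max_lr
        self.step_size = step_size
        self.iteration = 0

    def _lr(self):
        # Triangular schedule from the CLR paper: `x` measures how far
        # the current iteration is from the peak of its cycle.
        cycle = np.floor(1 + self.iteration / (2 * self.step_size))
        x = np.abs(self.iteration / self.step_size - 2 * cycle + 1)
        return self.base_lr + (self.max_lr - self.base_lr) * max(0.0, 1 - x)

    def on_train_batch_begin(self, batch, logs=None):
        K.set_value(self.model.optimizer.lr, self._lr())
        self.iteration += 1
```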
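
The second (2-state) 1-Cycle variant can be sketched the same way, under the same tf.keras assumptions; total_iterations and min_lr_factor are hypothetical parameter names, and placing the peak at the midpoint of training is an assumption, since the description above does not fix where state 1 ends.

```python
from tensorflow import keras
from tensorflow.keras import backend as K


class OneCycleLR(keras.callbacks.Callback):
    """2-state 1-Cycle sketch: linear base_lr -> max_lr over the first
    half of training, then linear max_lr -> min_lr over the rest."""

    def __init__(self, total_iterations, base_lr=1e-3, max_lr=1e-2,
                 min_lr_factor=100):  # min_lr = base_lr / min_lr_factor
        super().__init__()
        self.total = total_iterations
        self.peak = total_iterations // 2  # assumed midpoint peak
        self.base_lr = base_lr
        self.max_lr = max_lr
        self.min_lr = base_lr / min_lr_factor
        self.iteration = 0

    def _lr(self):
        if self.iteration <= self.peak:
            # State 1: climb from base_lr to max_lr.
            frac = self.iteration / self.peak
            return self.base_lr + (self.max_lr - self.base_lr) * frac
        # State 2: descend from max_lr to min_lr.
        frac = (self.iteration - self.peak) / (self.total - self.peak)
        return self.max_lr - (self.max_lr - self.min_lr) * frac

    def on_train_batch_begin(self, batch, logs=None):
        K.set_value(self.model.optimizer.lr, self._lr())
        self.iteration += 1
```

For example, you would pass callbacks=[OneCycleLR(total_iterations=steps_per_epoch * epochs)] to model.fit so the single cycle spans the whole run.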

References

Related Works

TODO

  • Cyclical Momentum
  • Learning Rate finder (similar to the one that fastai implemented)
  • Cosine Annealing (like PyTorch's)
  • Unit test