Increase EMA-parameter during training #82

Benjamin-Hansson · 2022-05-16T08:29:20Z

Hi, I noticed that the EMA parameter (called beta in the code, τ in the paper) is not updated during training. The paper describes increasing τ from its base value to 1 over the course of training: "Specifically, we set τ = 1 − (1 − τbase) · (cos(πk/K) + 1)/2 with k the current training step and K the maximum number of training steps."
This makes a huge difference to the validation loss at the end of training.
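
For reference, here is a minimal sketch of the schedule quoted above. The names `ema_tau` and `ema_update` are hypothetical helpers written for illustration, not functions from this repository, and the value 0.99 for the base τ is only an example.

```python
import math


def ema_tau(step: int, max_steps: int, tau_base: float) -> float:
    """Cosine schedule from the paper quote: tau = tau_base at step 0, tau = 1 at max_steps."""
    # (cos(pi * k / K) + 1) / 2 decays from 1 to 0 as k goes from 0 to K.
    decay = (math.cos(math.pi * step / max_steps) + 1.0) / 2.0
    return 1.0 - (1.0 - tau_base) * decay


def ema_update(target: float, online: float, tau: float) -> float:
    """Standard EMA step: keep a fraction tau of the target, take the rest from the online value."""
    return tau * target + (1.0 - tau) * online


if __name__ == "__main__":
    max_steps = 1000
    tau_base = 0.99  # illustrative value only, an assumption for this sketch
    for step in (0, 250, 500, 750, 1000):
        print(step, ema_tau(step, max_steps, tau_base))
```

In a training loop, the idea would be to recompute τ at every step and pass it to the target-network update in place of a fixed beta. How that plugs into the existing updater in this codebase is not shown here.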

michal-choinski · 2022-08-25T15:39:18Z

@Benjamin-Hansson just to clarify - when did you observe this spiking loss? Was it when the EMA parameter was not updated, or the other way around?
