
I have been reading this TensorFlow tutorial on transfer learning, in which they unfreeze the whole model and then say:

When you unfreeze a model that contains BatchNormalization layers in order to do fine-tuning, you should keep the BatchNormalization layers in inference mode by passing training=False when calling the base model. Otherwise the updates applied to the non-trainable weights will suddenly destroy what the model has learned.

My question is: why? The model's weights are adapting to the new data, so why do we keep the old mean and variance, which were calculated on ImageNet? This is very confusing.
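
For context, the fine-tuning setup I am referring to looks roughly like this (my own minimal sketch in the spirit of the Keras transfer learning guide; the MobileNetV2 base, input size, and classification head are just placeholders):

import tensorflow as tf

# Load a pretrained base model without its classification head.
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3),
    include_top=False,
    weights="imagenet",
)

# Unfreeze the base model so its weights are updated during fine-tuning.
base_model.trainable = True

inputs = tf.keras.Input(shape=(160, 160, 3))
# training=False keeps the BatchNormalization layers in inference mode:
# they use their stored moving mean/variance instead of batch statistics,
# and those moving statistics are not updated during training.
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)
model = tf.keras.Model(inputs, outputs)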

