I modified the ResNet-50 architecture to build a regression network: I simply added a BatchNorm1d layer and a ReLU layer just before the fully connected layer. During training, the output of the BatchNorm1d layer is around 3, and this gives good training results. However, during inference the output of the same BatchNorm1d layer is around 30, which leads to very poor test accuracy. In other words, the BatchNorm1d layer produces very different normalized outputs in training mode versus evaluation mode.
What causes this behavior, and how can I fix it? I am using PyTorch.