I have built a CNN that uses max-pooling layers. If I remove them, the network behaves ideally: the outputs and gradients at every layer have a variance close to 1. If I include them, the variance skyrockets.

This makes sense, of course: a max-pooling layer takes the maximum over each window, which must introduce a positive bias because the larger values are the ones selected.
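
A minimal sketch of that bias, assuming i.i.d. standard-normal activations and a 2x2 pooling window (both are illustrative assumptions, not details of the network above; `pooled_stats` is a hypothetical helper). It samples the maximum of four N(0, 1) draws and reports the mean, the variance, and the second moment of the result:

```python
import random
import statistics


def pooled_stats(window=4, samples=100_000, seed=0):
    """Return (mean, variance) of the max over `window` i.i.d. N(0, 1) draws."""
    rng = random.Random(seed)
    # Each entry mimics one max-pool output over a window of zero-mean inputs.
    maxima = [
        max(rng.gauss(0.0, 1.0) for _ in range(window))
        for _ in range(samples)
    ]
    return statistics.fmean(maxima), statistics.pvariance(maxima)


mean, var = pooled_stats()
# The second moment E[x^2] is what a following layer sees if it assumes
# zero-mean inputs; it equals the variance plus the squared mean.
second_moment = var + mean ** 2
print(f"mean={mean:.3f} var={var:.3f} second moment={second_moment:.3f}")
```

Under these assumptions the mean should come out clearly positive even though every input has zero mean, so any later layer whose initialization assumes zero-mean inputs receives a larger second moment than intended.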

I would just like to know what methods are typically used to combat this.

Pluviophile
Recessive
  • Remove max pooling, obviously :) Seriously, if you have a bias/variance problem, batch normalization, layer normalization, instance normalization, etc. sometimes help. – mirror2image Jul 01 '19 at 06:19
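
As a rough illustration of the normalization idea in the comment, here is a minimal sketch, assuming the pooled outputs come from 2x2 windows over i.i.d. N(0, 1) activations. `batch_standardize` is a hypothetical helper that only performs the standardization step of batch normalization (no learned scale or shift, no running statistics), not any library's implementation:

```python
import random
import statistics


def batch_standardize(values, eps=1e-5):
    """Shift and scale a batch to roughly zero mean and unit variance."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean)
    # eps guards against division by zero for a constant batch.
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


rng = random.Random(0)
# Hypothetical batch of max-pool outputs (assumed 2x2 windows, N(0, 1) inputs).
pooled = [max(rng.gauss(0.0, 1.0) for _ in range(4)) for _ in range(10_000)]
normalized = batch_standardize(pooled)

print("before:", statistics.fmean(pooled), statistics.pvariance(pooled))
print("after: ", statistics.fmean(normalized), statistics.pvariance(normalized))
```

Placing such a step after each pooling layer removes the positive shift before it reaches the next layer; whether batch, layer, or instance statistics work best depends on the network and batch size.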

0 Answers