
Hinton doesn't believe in the pooling operation (video). I've also heard that many max-pooling layers have been replaced by convolutional layers in recent years. Is that true? A sketch of what I understand by that replacement follows below.
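For concreteness, here is a minimal PyTorch sketch of what I mean (my own example, not from the video): a stride-2 convolution downsamples the feature map the same way a 2x2 max pool does, but with learned weights.

    import torch
    import torch.nn as nn

    x = torch.randn(1, 16, 32, 32)  # batch, channels, height, width

    # The usual pooling-based downsampling...
    pool = nn.MaxPool2d(kernel_size=2, stride=2)
    # ...versus a strided convolution that learns its own downsampling.
    strided_conv = nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1)

    print(pool(x).shape)          # torch.Size([1, 16, 16, 16])
    print(strided_conv(x).shape)  # torch.Size([1, 16, 16, 16])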

nbro
user559678

2 Answers


In addition to JCP's answer, I would like to add some more detail. At best, max pooling is a suboptimal way to reduce the complexity of the feature maps, and thereby to curb over/underfitting and improve model generalization (for translation-invariant classes). A small sketch of the operation follows below.
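To make the operation concrete, here is a minimal NumPy sketch of 2x2 max pooling (my own illustration, not from Hinton's talk): it quarters the feature map, and a strong activation that shifts within a single pooling window leaves the output unchanged, which is exactly the translation invariance mentioned above.

    import numpy as np

    def max_pool_2d(x, size=2, stride=2):
        """Naive max pooling over a single-channel feature map."""
        h, w = x.shape
        out_h = (h - size) // stride + 1
        out_w = (w - size) // stride + 1
        out = np.empty((out_h, out_w), dtype=x.dtype)
        for i in range(out_h):
            for j in range(out_w):
                window = x[i * stride:i * stride + size,
                           j * stride:j * stride + size]
                out[i, j] = window.max()
        return out

    # One strong activation; shifting it by one pixel within the
    # pooling window leaves the pooled output identical.
    fmap = np.zeros((4, 4))
    fmap[0, 0] = 1.0
    shifted = np.zeros((4, 4))
    shifted[1, 1] = 1.0

    print(max_pool_2d(fmap))     # [[1. 0.] [0. 0.]]
    print(max_pool_2d(shifted))  # [[1. 0.] [0. 0.]] -- same output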

However, as JCP begins to hit on, there are problems with this method. Hinton sums up the issues in his talk on what is wrong with CNNs. This also serves as the motivation for his novel architecture, capsule networks (or just capsules).

As he discusses, the main problem is not translational variance per se, but rather pose variance. CNNs with max pooling can handle simple transformations like flips or rotation without too much trouble. The problem comes with complicated transforms: features learned from a chair facing forwards will not contribute much to the class representation if the real-world examples contain chairs upside down, on their side, and so on.

However, there is much work being done here, mostly in two areas: novel architectures/methods, and inferring 3D structure from images (via CNN tweaks). This problem has been one of the bigger motivators for researchers across the decades, going back even to David Marr and his primal sketches.

hisairnessag3

Max pooling isn't bad; it just depends on what you are using the convnet for. For example, if you are analyzing objects and the position of the object is important, you shouldn't use it, because of the translation invariance it introduces; if you just need to detect an object, it can help by reducing the size of the matrix you pass to the next convolutional layer (see the sketch below). So it comes down to the application you are going to use your CNN for.
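As a rough sketch of that size reduction (my own PyTorch example, assuming a 32x32 input, not code from any particular paper): a 2x2 max pool after a convolution hands the next layer a tensor with a quarter of the activations.

    import torch
    import torch.nn as nn

    x = torch.randn(1, 3, 32, 32)  # batch, channels, height, width

    conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
    pool = nn.MaxPool2d(kernel_size=2, stride=2)

    features = conv(x)
    print(features.shape)        # torch.Size([1, 16, 32, 32])
    print(pool(features).shape)  # torch.Size([1, 16, 16, 16]) -- 4x fewer activations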

JCP