Pooling in CNNs (ML & CV)

Pooling is a form of non-linear down-sampling (image reduction) that replaces each local region of the input with a single value computed by an aggregation function.

Max pooling is the most common form. It partitions the input image into a set of non-overlapping N x N rectangles and, for each such sub-region, outputs the maximum value,
which in practice keeps the strongest feature response in each region.
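
As an illustration of the partition-and-take-the-maximum step described above, here is a minimal NumPy sketch. The function name max_pool_2d and the 4 x 4 sample values are made up for this example, and the choice to drop rows and columns that do not fill a complete window is an assumption; frameworks differ on how they handle such edges.

```python
import numpy as np

def max_pool_2d(image, n):
    """Non-overlapping n x n max pooling over a 2-D array (window = stride = n)."""
    h, w = image.shape
    # Assumption: trailing rows/columns that do not fill a full window are dropped.
    h_out, w_out = h // n, w // n
    trimmed = image[:h_out * n, :w_out * n]
    # View the image as (h_out, n, w_out, n) blocks and take the max inside each block.
    blocks = trimmed.reshape(h_out, n, w_out, n)
    return blocks.max(axis=(1, 3))

# Hypothetical 4 x 4 input, pooled with 2 x 2 windows.
image = np.array([
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
])
pooled = max_pool_2d(image, 2)
print(pooled)
# [[6 4]
#  [8 9]]
```

Each output value is the largest entry of one 2 x 2 block, so the 4 x 4 input shrinks to 2 x 2.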

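Max pooling is non-overlapping when the stride is equal to the window size or larger:
given a window of height N and width N, the vertical and horizontal stride S must satisfy S >= N. The usual choice is S = N, which tiles the input exactly; S > N skips pixels between windows, and S < N makes the windows overlap.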
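
To make the relation between window size N and stride S concrete, the sketch below pools the same input with a non-overlapping setting (S = N) and an overlapping one (S < N). The helper names pool_output_size and max_pool_strided are hypothetical, and the output-size formula floor((size - N) / S) + 1 assumes no padding.

```python
import numpy as np

def pool_output_size(size, n, s):
    """Number of window positions along one axis, assuming no padding."""
    return (size - n) // s + 1

def max_pool_strided(image, n, s):
    """Max pooling with an n x n window moved s pixels at a time."""
    h, w = image.shape
    h_out = pool_output_size(h, n, s)
    w_out = pool_output_size(w, n, s)
    out = np.empty((h_out, w_out), dtype=image.dtype)
    for i in range(h_out):
        for j in range(w_out):
            # Window whose top-left corner sits at (i * s, j * s).
            out[i, j] = image[i * s:i * s + n, j * s:j * s + n].max()
    return out

image = np.arange(36).reshape(6, 6)

non_overlapping = max_pool_strided(image, n=2, s=2)  # S = N: windows tile the input
overlapping = max_pool_strided(image, n=3, s=2)      # S < N: neighbouring windows share pixels

print(non_overlapping.shape)  # (3, 3)
print(overlapping.shape)      # (2, 2)
```

With S = N every input pixel belongs to exactly one window; with S < N some pixels contribute to more than one output value.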

Advantages of the pooling layer:

1. Reduces the spatial size of the representation, which reduces memory use.
2. Reduces the amount of computation in the network and the number of parameters in later layers (the pooling layer itself has no learnable parameters), which can help control overfitting.
3. Helps the CNN detect objects regardless of small shifts in their location (translation invariance).
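
The location-invariance point can be checked on a toy case. The sketch below is only an illustration under simple assumptions (a single active pixel, 2 x 2 non-overlapping windows, even image dimensions); the helper max_pool_2x2 is a made-up name. It shows that the invariance is local: a shift that stays inside one pooling window leaves the output unchanged, while a larger shift does not.

```python
import numpy as np

def max_pool_2x2(image):
    """2 x 2 non-overlapping max pooling; assumes even height and width."""
    h, w = image.shape
    return image.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

a = np.zeros((4, 4), dtype=int)
a[0, 0] = 1  # feature at the top-left pixel

b = np.zeros((4, 4), dtype=int)
b[1, 1] = 1  # same feature shifted by one pixel, still inside the first window

c = np.zeros((4, 4), dtype=int)
c[2, 2] = 1  # shifted further, into a different window

same_window = np.array_equal(max_pool_2x2(a), max_pool_2x2(b))
other_window = np.array_equal(max_pool_2x2(a), max_pool_2x2(c))
print(same_window, other_window)  # True False
```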

The pooling layer reduces the height and width of its input but doesn’t change its depth (the number of channels).
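
A quick shape check makes this visible. The sketch below assumes a height x width x channels layout and a hypothetical 32 x 32 input with 3 channels filled with random values; max_pool_hwc is an illustrative name, not a library function. Pooling is applied to each channel independently.

```python
import numpy as np

def max_pool_hwc(x, n):
    """Non-overlapping n x n max pooling applied to every channel of an (H, W, C) array."""
    h, w, c = x.shape
    h_out, w_out = h // n, w // n
    # Assumption: rows/columns that do not fill a complete window are dropped.
    x = x[:h_out * n, :w_out * n, :]
    # The channel axis is left untouched; only the two spatial axes are reduced.
    return x.reshape(h_out, n, w_out, n, c).max(axis=(1, 3))

x = np.random.default_rng(0).random((32, 32, 3))
y = max_pool_hwc(x, 2)

print(x.shape, "->", y.shape)  # (32, 32, 3) -> (16, 16, 3)
print(x.size // y.size)        # 4
```

Height and width are halved while the depth stays at 3, so the representation holds four times fewer values.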