The concept of Global Average Pooling (GAP) was first proposed in the paper "Network in Network" by Min Lin et al.
The main idea of GAP can be divided into two parts:
1. Generate one feature map for each category;
2. Calculate the average of each feature map; the resulting vector of averages is fed into softmax to give the probability of each category.
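The two steps above can be sketched in a few lines of NumPy. This is only a minimal illustration, not the paper's implementation: the array `feature_maps`, its values, and the helper names `global_average_pooling` and `softmax` are assumptions made for the example, and the feature maps are taken as given rather than produced by a real network.

```python
import numpy as np

def global_average_pooling(feature_maps):
    """Average each map over its spatial dimensions.

    feature_maps has shape (num_classes, H, W); the result has shape (num_classes,).
    """
    return feature_maps.mean(axis=(1, 2))

def softmax(x):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

# Step 1 (assumed already done by the network): one 2x2 feature map per category.
feature_maps = np.zeros((3, 2, 2))
feature_maps[0] = 1.0                        # spatial mean is 1
feature_maps[1] = [[0.0, 2.0], [4.0, 6.0]]   # spatial mean is 3
feature_maps[2] = -1.0                       # spatial mean is -1

# Step 2: one average per map, then softmax over the averages.
scores = global_average_pooling(feature_maps)
probs = softmax(scores)
```

Here `scores` holds one number per category and `probs` sums to 1, so the category whose map has the largest average gets the highest probability.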
My point of view:
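GAP should be useful for image classification, but it adds little in semantic segmentation. The first step is already necessary in semantic segmentation, since the network has to output one map per category anyway, and the second step conflicts with the goal of classifying each pixel, because averaging collapses each map into a single value and discards the spatial information.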
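A small sketch of that conflict, under the assumption that we already have one score map per category (the array `score_maps` and its values are made up for illustration): per-pixel classification takes the argmax across categories at every location, while GAP reduces each map to one number and can only yield a single label for the whole image.

```python
import numpy as np

# Hypothetical score maps with shape (num_classes, H, W).
score_maps = np.array([
    [[5.0, 5.0], [0.0, 0.0]],   # category 0 is strong in the top row
    [[0.0, 0.0], [4.0, 4.0]],   # category 1 is strong in the bottom row
])

# Semantic segmentation: pick the best category at each pixel, keeping (H, W).
pixel_labels = score_maps.argmax(axis=0)

# GAP: average each map over space, leaving one score per category.
image_scores = score_maps.mean(axis=(1, 2))
image_label = int(image_scores.argmax())
```

In this example `pixel_labels` still separates the top row from the bottom row, whereas `image_label` is a single category for the entire image.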