Key Takeaways

1 Goal of the Paper

Propose a more effective feature pyramid for handling multi-scale objects, one of the hard problems in object detection.

[Figure: comparison of four feature pyramid designs]

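SSD-style feature pyramid: uses features from different levels of the backbone directly for detection.
FPN-style feature pyramid: fuses multi-level feature maps to build the feature pyramid.
STDN-style feature pyramid: predicts from different layers of the last block of DenseNet. This sounds a bit like SSD, but it is still not quite the same. It takes feature maps of the same size and downsamples or upsamples them to obtain feature maps at different resolutions, then runs detection on those. The rationale is a strong trust in the feature-extraction ability of DenseNet's last dense block.
The feature pyramid proposed in this paper. $\color{Red}{\text{Note: the pyramid is visibly complex, which implies that speed will suffer}}$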
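
To make the difference between the first three styles concrete, here is a minimal NumPy sketch. It is only an illustration under stated assumptions, not any of the papers' implementations: the function names are hypothetical, all levels are assumed to share one channel count (real FPN adds 1x1 lateral convolutions), and plain nearest-neighbour upsampling and average pooling stand in for whatever resampling operators STDN actually uses.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def avg_pool2x(x):
    """2x2 average pooling of a (C, H, W) array (H and W must be even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def ssd_style(backbone_feats):
    # SSD-style: each backbone level is used for detection as-is.
    return list(backbone_feats)

def fpn_style(backbone_feats):
    # FPN-style: top-down fusion. Input is ordered fine -> coarse.
    # Assumption: equal channel counts, so maps can be added directly.
    feats = list(backbone_feats)
    out = [feats[-1]]                      # start from the coarsest level
    for f in reversed(feats[:-1]):
        out.append(f + upsample2x(out[-1]))
    return out[::-1]                       # back to fine -> coarse order

def stdn_style(last_block_feat):
    # STDN-style: one same-resolution map is resampled up and down.
    # Assumption: nearest-neighbour / average pooling as stand-ins.
    return [
        upsample2x(last_block_feat),
        last_block_feat,
        avg_pool2x(last_block_feat),
    ]

if __name__ == "__main__":
    feats = [np.ones((4, 8, 8)), np.ones((4, 4, 4)), np.ones((4, 2, 2))]
    for name, pyramid in [
        ("ssd", ssd_style(feats)),
        ("fpn", fpn_style(feats)),
        ("stdn", stdn_style(feats[1])),
    ]:
        print(name, [p.shape for p in pyramid])
```

All three produce pyramids with the same shapes; what differs is where the information in each level comes from: a single backbone level (SSD), a fusion of coarser levels into finer ones (FPN), or one resampled map (STDN).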

Generally speaking, the above-mentioned methods have the two following limitations.

First, feature maps in the pyramid are not representative enough for the object detection task, since they are simply constructed from the layers (features) of the backbone designed for object classification task.
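That is, because the feature maps in the pyramid come from layers of a backbone designed for classification, they are not expressive enough for the object detection task. $\color{Red}{\text{Note: this feels rather far-fetched!}}$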
Second, each feature map in the pyramid (used for detecting objects in a specific range of size) is mainly or even solely constructed from single-level layers of the backbone, that is, it mainly or only contains single-level information.
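That is, each feature map in the pyramid is built mainly from a single level of the backbone, so it carries mainly or only single-level information. $\color{Red}{\text{Note: what kind of explanation is this? Like the point above, it feels forced}}$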