Paper | Detecting Twenty-thousand Classes using Image-level Supervision

Preface

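  • Venue: ECCV 2022
  • Model name: Detic
  • Overall summary: like the earlier OVD-Net, this paper tackles the open-vocabulary problem from the pretraining angle, but its approach is simpler and more direct: it adds the ImageNet classes to training. Strictly speaking this is not true open vocabulary, but it can cover roughly twenty thousand classes.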

1. Introduction:

  1. Object detection (OD) has two subtasks: 1) finding boxes (localization); 2) naming the boxes (classification);

  2. Previous works couple these two subtasks;

  3. However, detection benchmarks are much smaller than classification benchmarks.

As the figure shows, LVIS (detection) has far fewer images and far fewer categories than ImageNet (classification).

image.png

This paper:

The paper proposes Detic (Detector with image classes), which uses image-level supervision in addition to detection supervision. It aims to:

  • decouple the localization and classification sub-problems;

  • use image-level labels to train the classifier and broaden the vocabulary of the detector;

illustration:

image.png

Standard OD: needs ground-truth boxes and labels.

Weakly supervised OD: assigns image-level labels to predicted boxes, which is error-prone.

This paper: assigns image-level labels to the max-size proposal.

2 Method

2.1 Preliminary

  • detection dataset D_{det}, with class set C_{det}

  • image classification dataset D_{cls}, with class set C_{cls}

  • testing dataset with class set C_{test}.

  • C_{det}, C_{cls}, and C_{test} may or may not overlap.

Traditional OD: C_{test} = C_{det}, D_{cls} = \emptyset

Open-vocabulary detection (OVD): allows C_{test} \neq C_{det}
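
The relationship between these class sets can be made concrete with a small sketch. The class names and the helpers `novel_classes` and `is_traditional_setting` are assumptions made up for illustration; they do not come from the paper.

```python
def novel_classes(c_det, c_test):
    """Test classes that have no box annotations in the detection dataset."""
    return set(c_test) - set(c_det)


def is_traditional_setting(c_det, c_test, d_cls):
    """Traditional OD: C_test equals C_det and no classification data is used."""
    return set(c_test) == set(c_det) and len(d_cls) == 0


# Illustrative class sets (assumed, not taken from the paper).
c_det = {"cat", "dog"}                    # classes with box annotations
c_cls = {"cat", "dog", "lion", "otter"}   # classes with image-level labels only
c_test = {"cat", "lion"}                  # classes queried at test time

print(novel_classes(c_det, c_test))
print(is_traditional_setting(c_det, c_test, d_cls=["some_image"]))
```

In this toy setup, "lion" is a test class that the detector never saw a box for, which is the case the OVD setting allows and the traditional setting rules out.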

2.2 Detic

The overall idea is quite simple.

  • Use both the detection dataset D_{det} and the classification dataset D_{cls} to train the detection model.
image.png
  1. Sample a batch from both D_{det} and D_{cls}.

  2. If the image belongs to D_{det}, then loss = the typical OD loss: RPN loss + regression loss + classification loss.

  3. If the image belongs to D_{cls}, then loss = max-size loss: the proposal with the largest size is taken as the region for the image-level labels, and that region is used to calculate the classification loss.
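
The three steps above can be sketched in plain Python. This is a minimal illustration rather than the authors' implementation. It assumes that "size" means box area, that the classification loss is a sigmoid binary cross-entropy, and that the detection terms are summed without weights. The `sample` dictionary layout and all function names are hypothetical.

```python
import math


def box_area(box):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes count as 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def max_size_index(proposals):
    """Index of the proposal with the largest area (assumed meaning of 'max size')."""
    return max(range(len(proposals)), key=lambda i: box_area(proposals[i]))


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over classes, computed from raw logits."""
    total = 0.0
    for z, t in zip(logits, targets):
        # Numerically stable form: max(z, 0) - z * t + log(1 + exp(-|z|))
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def max_size_loss(proposals, proposal_logits, image_labels, num_classes):
    """Classification loss on the largest proposal, using image-level labels."""
    j = max_size_index(proposals)
    targets = [1.0 if c in image_labels else 0.0 for c in range(num_classes)]
    return bce_with_logits(proposal_logits[j], targets)


def detic_image_loss(sample, num_classes):
    """Pick the loss by the dataset the image came from (hypothetical schema)."""
    if sample["source"] == "det":
        # Detection image: the detection loss terms are assumed precomputed.
        return sample["rpn_loss"] + sample["reg_loss"] + sample["cls_loss"]
    # Classification image: only the max-size classification loss applies.
    return max_size_loss(
        sample["proposals"], sample["logits"], sample["labels"], num_classes
    )
```

A call for a classification image might look like `detic_image_loss({"source": "cls", "proposals": boxes, "logits": scores, "labels": {3}}, num_classes=10)`, where `boxes` and `scores` stand in for the proposal outputs of the detector. Note that the label assignment depends only on box geometry, not on the predicted class scores, which is what separates it from the prediction-based assignment used in weakly supervised OD.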

image.png