Paper Reading
1. Detecting Unseen Visual Relations Using Analogies
Authors
Julia Peyre, Ivan Laptev, Cordelia Schmid, Josef Sivic
ICCV 2019
Datasets
HICO-DET, COCO-a, UnRel
Abstract:
We seek to detect visual relations in images of the form of triplets t = (subject, predicate, object), such as “person riding dog”, where training examples of the individual entities are available but their combinations are unseen at training. This is an important set-up due to the combinatorial nature of visual relations: collecting sufficient training data for all possible triplets would be very hard. The contributions of this work are three-fold. First, we learn a representation of visual relations that combines (i) individual embeddings for subject, object and predicate together with (ii) a visual phrase embedding that represents the relation triplet. Second, we learn how to transfer visual phrase embeddings from existing training triplets to unseen test triplets using analogies between relations that involve similar objects. Third, we demonstrate the benefits of our approach on three challenging datasets: on HICO-DET, our model achieves significant improvement over a strong baseline for both frequent and unseen triplets, and we observe similar improvement for the retrieval of unseen triplets with out-of-vocabulary predicates on the COCO-a dataset as well as the challenging unusual triplets in the UnRel dataset.
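To make the combined representation in the abstract more concrete, here is a minimal sketch of scoring one candidate triplet across four embedding branches (subject, predicate, object, visual phrase). The function names, the use of cosine similarity, and the choice to sum the branch scores are assumptions made for illustration; the paper may aggregate the branches differently.

```python
import numpy as np

BRANCHES = ("subject", "predicate", "object", "visual_phrase")

def l2_normalize(x):
    """Scale a vector to unit length (returns the input for a zero vector)."""
    norm = np.linalg.norm(x)
    return x / norm if norm > 0 else x

def triplet_score(visual, language):
    """Sum cosine similarities over the four branches (an assumed aggregation).

    `visual` and `language` are dicts keyed by branch name. Each value is a
    vector assumed to be already projected into that branch's joint space.
    """
    return sum(
        float(l2_normalize(visual[b]) @ l2_normalize(language[b]))
        for b in BRANCHES
    )

# Toy data: random language embeddings, plus a matching and an opposite visual input.
rng = np.random.default_rng(0)
language = {b: rng.normal(size=8) for b in BRANCHES}
matching = {b: v.copy() for b, v in language.items()}
mismatched = {b: -v for b, v in language.items()}

score_match = triplet_score(matching, language)
score_mismatch = triplet_score(mismatched, language)
print(score_match, score_mismatch)
```

In this toy setup the matching input scores close to 4 (one unit per branch) and the opposite input close to -4, which shows how agreement in every branch raises the triplet score.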
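Contributions
1. First, by learning complementary visual-language embeddings for the subject, object, predicate, and visual phrase, the authors combine the advantages of compositional and visual phrase representations.
2. Second, they build an analogy transfer model to obtain visual phrase embeddings for relations never seen during training.
3. Third, they evaluate the approach experimentally on three challenging datasets and show its benefits on both frequent and unseen relations.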
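The analogy transfer idea in the second contribution can be illustrated with a small sketch: estimate the visual phrase embedding of an unseen triplet from the embeddings of the most similar seen triplets. Everything here is an assumption for illustration (the helper `transfer_phrase_embedding`, the mean-cosine triplet similarity, the softmax-style weights, and the toy word vectors); it is a simplified stand-in, not the transfer function learned in the paper.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def transfer_phrase_embedding(target_words, seen_words, seen_phrases, k=2):
    """Weighted average of the phrase embeddings of the k most similar seen triplets.

    target_words: (3, d) array of word vectors for (subject, predicate, object)
    seen_words:   list of (3, d) arrays, one per seen triplet
    seen_phrases: list of phrase embedding vectors, one per seen triplet
    """
    # Triplet similarity = mean cosine similarity over the three roles (assumed).
    sims = np.array([
        np.mean([cosine(target_words[i], w[i]) for i in range(3)])
        for w in seen_words
    ])
    top = np.argsort(sims)[::-1][:k]   # indices of the k most similar seen triplets
    weights = np.exp(sims[top])
    weights /= weights.sum()           # normalize so the weights sum to 1
    return sum(w * seen_phrases[i] for w, i in zip(weights, top))

# Hypothetical toy word vectors; "dog" is deliberately placed close to "horse".
vec = {
    "person": np.array([1.0, 0.0, 0.0, 0.0]),
    "ride":   np.array([0.0, 1.0, 0.0, 0.0]),
    "hold":   np.array([0.0, 0.0, 1.0, 0.0]),
    "horse":  np.array([0.0, 0.0, 0.0, 1.0]),
    "dog":    np.array([0.1, 0.0, 0.0, 0.9]),
    "cup":    np.array([0.0, 0.5, 0.5, 0.0]),
}

def words(s, p, o):
    return np.stack([vec[s], vec[p], vec[o]])

seen_words = [words("person", "ride", "horse"), words("person", "hold", "cup")]
seen_phrases = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
target = words("person", "ride", "dog")   # unseen combination

nearest_only = transfer_phrase_embedding(target, seen_words, seen_phrases, k=1)
blended = transfer_phrase_embedding(target, seen_words, seen_phrases, k=2)
print(nearest_only, blended)
```

With `k=1` the unseen "person ride dog" simply borrows the embedding of "person ride horse"; with `k=2` the result is a blend that still leans toward that more similar source triplet.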
Dataset notes
The UnRel dataset comes from
http://openaccess.thecvf.com/content_ICCV_2017/papers/Peyre_Weakly-Supervised_Learning_of_ICCV_2017_paper.pdf
The dataset mainly consists of images showing uncommon combinations of subject, verb, and object triplets.