[Paper Reading] Aggregating local descriptors into a compact image representation

Paper Site: https://lear.inrialpes.fr/pubs/2010/JDSP10/jegou_compactimagerepresentation.pdf

Problem Definition

The paper aims to jointly satisfy three constraints in very-large-scale image search: search accuracy, search efficiency, and the memory usage of the image representation.

Contribution and Discussion

  1. Propose a simple yet efficient way of aggregating local image descriptors into a vector of limited dimension, which can be viewed as a simplification of the Fisher kernel representation.

  2. Jointly optimize the dimension reduction and the indexing algorithm, so that it best preserves the quality of vector comparison.

  3. Significantly outperform the state of the art: search accuracy is comparable to the bag-of-features approach with an image representation that fits in 20 bytes, and searching a 10-million-image dataset takes about 50 ms.
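The aggregation in contribution 1 is the paper's VLAD descriptor: each local descriptor is assigned to its nearest codebook centroid, the residuals (descriptor minus centroid) are summed per centroid, and the concatenated sums are L2-normalized. Below is a minimal pure-Python sketch of that idea. The function name `vlad` and the toy centroids are assumptions for illustration; a real codebook would be learned with k-means on SIFT-like descriptors.

```python
import math

def vlad(descriptors, centroids):
    """Aggregate local descriptors into one L2-normalized vector of size k * d."""
    k, d = len(centroids), len(centroids[0])
    sums = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Assign the descriptor to its nearest centroid (squared Euclidean distance).
        i = min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centroids[j])))
        # Accumulate the residual x - c_i for that centroid.
        for t in range(d):
            sums[i][t] += x[t] - centroids[i][t]
    flat = [value for row in sums for value in row]
    norm = math.sqrt(sum(value * value for value in flat))
    return [value / norm for value in flat] if norm > 0 else flat

# Toy example (hypothetical data): 2 centroids, 2-dimensional descriptors.
toy_centroids = [[0.0, 0.0], [10.0, 10.0]]
toy_descriptors = [[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]]
toy_vlad = vlad(toy_descriptors, toy_centroids)
```

The output dimension is k times d regardless of how many local descriptors the image has, which is what makes the representation fixed-size and suitable for later compression.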

Method

  1. From vectors to codes: jointly optimize (1) a projection that reduces the dimensionality of the vector and (2) a quantizer used to index the resulting vectors.

  2. Dimensionality reduction for approximate nearest neighbor search: use principal component analysis (PCA) to project the vector to a lower dimension before it is quantized.

  3. Allocate different numbers of bits to the different components to balance the components' variance.
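The quantization step can be illustrated with a product-quantization style encoder and an asymmetric distance, in which the query stays unquantized while each database vector is stored only as a short list of centroid indices. This is a simplified sketch, not the paper's implementation: it assumes the vector has already been PCA-projected, splits it into equal-sized subvectors, and uses tiny hand-written codebooks. The names `pq_encode`, `adc_distance`, and `toy_codebooks` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pq_encode(vec, codebooks):
    """Return one centroid index per subvector (the compact code)."""
    m = len(codebooks)
    sub = len(vec) // m  # assumes len(vec) is divisible by m
    codes = []
    for j, book in enumerate(codebooks):
        part = vec[j * sub:(j + 1) * sub]
        codes.append(min(range(len(book)), key=lambda c: sq_dist(part, book[c])))
    return codes

def adc_distance(query, codes, codebooks):
    """Approximate squared distance between a raw query and an encoded vector."""
    m = len(codebooks)
    sub = len(query) // m
    return sum(
        sq_dist(query[j * sub:(j + 1) * sub], codebooks[j][codes[j]])
        for j in range(m)
    )

# Toy example (hypothetical data): 2 sub-codebooks, 2 centroids each.
toy_codebooks = [
    [[0, 0], [1, 1]],
    [[0, 0], [2, 2]],
]
toy_codes = pq_encode([0.9, 1.2, 0.1, -0.1], toy_codebooks)   # [1, 0]
toy_dist = adc_distance([0, 0, 0, 0], toy_codes, toy_codebooks)  # 2
```

With 256 centroids per sub-codebook, each index fits in one byte, so a code with m subvectors occupies m bytes. That is how a representation of a few tens of bytes per image becomes possible.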
