Image features: Dense Scale Invariant Feature Transform (DSIFT)

Table of Contents

Overview
Usage
Technical details
    Dense descriptors
    Sampling

Author:
Andrea Vedaldi
Brian Fulkerson

dsift.h implements a dense version of SIFT. This is an object that can quickly compute descriptors for densely sampled keypoints with identical size and orientation. It can be reused for multiple images of the same size.
Overview
See also
The SIFT module, Technical details

This module implements a fast algorithm for the calculation of a large number of SIFT descriptors of densely sampled features of the same scale and orientation. See the SIFT module for an overview of SIFT.
The feature frames (keypoints) are indirectly specified by the sampling steps (vl_dsift_set_steps) and the sampling bounds (vl_dsift_set_bounds). The descriptor geometry (number and size of the spatial bins and number of orientation bins) can be customized (vl_dsift_set_geometry, VlDsiftDescriptorGeometry).

Figure: Dense SIFT descriptor geometry (dsift-geom.png)

By default, SIFT uses a Gaussian windowing function that discounts contributions of gradients further away from the descriptor centers. This function can be changed to a flat window by invoking vl_dsift_set_flat_window. In this case, gradients are accumulated using only bilinear interpolation, but instead of being reweighted by a Gaussian window, they are all weighted equally. However, after gradients have been accumulated into a spatial bin, the whole bin is reweighted by the average of the Gaussian window over the spatial support of that bin. This “approximation” substantially improves speed with little or no loss of performance in applications.
Keypoints are sampled in such a way that the centers of the spatial bins are at integer coordinates within the image boundaries. For instance, the top-left bin of the top-left descriptor is centered on the pixel (0,0). The bin immediately to its right is centered at (binSizeX, 0), where binSizeX is a parameter in the VlDsiftDescriptorGeometry structure. vl_dsift_set_bounds can be used to further restrict sampling to the keypoints in a given region of the image.
Usage
DSIFT is implemented by a VlDsiftFilter object that can be used to process a sequence of images of a given geometry. To use the DSIFT filter:
1. Initialize a new DSIFT filter object with vl_dsift_new (or the simplified vl_dsift_new_basic). Customize the descriptor parameters with vl_dsift_set_steps, vl_dsift_set_geometry, etc.
2. Process an image with vl_dsift_process.
3. Retrieve the number of keypoints (vl_dsift_get_keypoint_num), the keypoints (vl_dsift_get_keypoints), and their descriptors (vl_dsift_get_descriptors).
4. Optionally repeat for more images.
5. Delete the DSIFT filter with vl_dsift_delete.

Technical details
This section extends the SIFT descriptor section and specializes it to the case of dense keypoints.
Dense descriptors
When computing descriptors for many keypoints differing only by their position (and with null rotation), further simplifications are possible. In this case, in fact,
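\[
\begin{aligned}
\mathbf{x} &= m\sigma\,\hat{\mathbf{x}} + T, \\
h(t,i,j) &= m\sigma \int
  g_{\sigma_{\mathrm{win}}}(\mathbf{x} - T)\,
  w_{\mathrm{ang}}(\angle J(\mathbf{x}) - \theta_t)\,
  w\!\left(\frac{x - T_x}{m\sigma} - \hat{x}_i\right)
  w\!\left(\frac{y - T_y}{m\sigma} - \hat{y}_j\right)
  |J(\mathbf{x})|\, d\mathbf{x}.
\end{aligned}
\]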

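Since many different values of $T$ are sampled, this is conveniently expressed as a separable convolution. First, we translate by $\mathbf{x}_{ij} = m\sigma(\hat{x}_i,\ \hat{y}_j)^\top$ and we use the symmetry of the various binning and windowing functions to write
\[
\begin{aligned}
h(t,i,j) &= m\sigma \int
  g_{\sigma_{\mathrm{win}}}(T' - \mathbf{x} - \mathbf{x}_{ij})\,
  w_{\mathrm{ang}}(\angle J(\mathbf{x}) - \theta_t)\,
  w\!\left(\frac{T'_x - x}{m\sigma}\right)
  w\!\left(\frac{T'_y - y}{m\sigma}\right)
  |J(\mathbf{x})|\, d\mathbf{x}, \\
T' &= T + m\sigma \begin{bmatrix} \hat{x}_i \\ \hat{y}_j \end{bmatrix}.
\end{aligned}
\]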
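The step between the two expressions for $h(t,i,j)$ can be checked term by term. The following is a short verification, assuming only that $g_{\sigma_{\mathrm{win}}}$ and $w$ are even functions (the symmetry invoked above) and that $T' = T + \mathbf{x}_{ij}$:

```latex
\begin{aligned}
\mathbf{x} - T
  &= \mathbf{x} - (T' - \mathbf{x}_{ij})
   = -\,(T' - \mathbf{x} - \mathbf{x}_{ij})
  &&\Rightarrow\quad
  g_{\sigma_{\mathrm{win}}}(\mathbf{x} - T)
   = g_{\sigma_{\mathrm{win}}}(T' - \mathbf{x} - \mathbf{x}_{ij}), \\
\frac{x - T_x}{m\sigma} - \hat{x}_i
  &= \frac{x - (T_x + m\sigma \hat{x}_i)}{m\sigma}
   = -\,\frac{T'_x - x}{m\sigma}
  &&\Rightarrow\quad
  w\!\left(\frac{x - T_x}{m\sigma} - \hat{x}_i\right)
   = w\!\left(\frac{T'_x - x}{m\sigma}\right).
\end{aligned}
```

The same argument applies to the $y$ coordinate with $\hat{y}_j$ in place of $\hat{x}_i$.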

Then we define kernels
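\[
\begin{aligned}
k_i(x) &= \frac{1}{\sqrt{2\pi}\,\sigma_{\mathrm{win}}}
  \exp\!\left(-\frac{1}{2}\frac{(x - x_i)^2}{\sigma_{\mathrm{win}}^2}\right)
  w\!\left(\frac{x}{m\sigma}\right), \\
k_j(y) &= \frac{1}{\sqrt{2\pi}\,\sigma_{\mathrm{win}}}
  \exp\!\left(-\frac{1}{2}\frac{(y - y_j)^2}{\sigma_{\mathrm{win}}^2}\right)
  w\!\left(\frac{y}{m\sigma}\right),
\end{aligned}
\]
(here $x_i$ and $y_j$ denote the two components of $\mathbf{x}_{ij}$)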

and obtain
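\[
\begin{aligned}
h(t,i,j) &= (k_i k_j * \bar{J}_t)\!\left(T + m\sigma \begin{bmatrix} \hat{x}_i \\ \hat{y}_j \end{bmatrix}\right), \\
\bar{J}_t(\mathbf{x}) &= w_{\mathrm{ang}}(\angle J(\mathbf{x}) - \theta_t)\,|J(\mathbf{x})|.
\end{aligned}
\]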

Furthermore, if we use a flat rather than Gaussian windowing function, the kernels do not depend on the bin, and we have
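\[
\begin{aligned}
k(z) &= \frac{1}{\sigma_{\mathrm{win}}}\, w\!\left(\frac{z}{m\sigma}\right), \\
h(t,i,j) &= (k(x)\,k(y) * \bar{J}_t)\!\left(T + m\sigma \begin{bmatrix} \hat{x}_i \\ \hat{y}_j \end{bmatrix}\right),
\end{aligned}
\]

(here $\sigma_{\mathrm{win}}$ is the side of the flat window).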
Note
In this case the binning functions $k(z)$ are triangular, and the convolution can be computed in time independent of the filter (i.e. descriptor bin) support size by using integral signals.
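One way to see why the cost does not depend on the support size is that a triangle is the convolution of two box filters, and each box filter reduces to a difference of two running sums. The sketch below illustrates this in one dimension. It is not the library's routine: the function names are hypothetical, the signal is zero-padded at the borders, and the triangle is left unnormalized (peak value b for bin size b).

```c
#include <assert.h>
#include <stdlib.h>

/* Reference: direct convolution with k[d] = b - |d| for |d| < b.
 * Cost grows with the support, O(n * b). Zero padding at the borders. */
static void tri_conv_direct(const double *x, int n, int b, double *y)
{
    assert(n > 0 && b > 0);
    for (int t = 0; t < n; ++t) {
        double acc = 0.0;
        for (int d = -(b - 1); d <= b - 1; ++d) {
            int s = t + d;
            if (s >= 0 && s < n) acc += (b - abs(d)) * x[s];
        }
        y[t] = acc;
    }
}

/* Same result from two running sums: O(n), whatever the value of b.
 * Returns 0 on success, -1 if memory allocation fails. */
static int tri_conv_integral(const double *x, int n, int b, double *y)
{
    assert(n > 0 && b > 0);
    int m = n + b - 1;                    /* length of the box-filtered signal */
    double *p = calloc((size_t)n + 1, sizeof *p);
    double *q = calloc((size_t)m + 1, sizeof *q);
    if (!p || !q) { free(p); free(q); return -1; }

    /* p[i] = x[0] + ... + x[i-1] (integral signal of x). */
    for (int i = 0; i < n; ++i) p[i + 1] = p[i] + x[i];

    /* First box: u[t] = x[t-b+1] + ... + x[t]; q is the integral of u. */
    for (int t = 0; t < m; ++t) {
        int hi = (t + 1 < n) ? t + 1 : n;
        int lo = (t - b + 1 > 0) ? t - b + 1 : 0;
        q[t + 1] = q[t] + (p[hi] - p[lo]);
    }

    /* Second box: y[t] = u[t] + ... + u[t+b-1]. */
    for (int t = 0; t < n; ++t) y[t] = q[t + b] - q[t];

    free(p);
    free(q);
    return 0;
}
```

Dividing the output by b would give the triangle with unit peak, which matches $w(z / m\sigma)$ for $b = m\sigma$ if $w$ is the triangular function $\max(0, 1 - |z|)$.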

Sampling
To avoid resampling and dealing with special boundary conditions, we impose some mild restrictions on the geometry of the descriptors that can be computed. In particular, we impose that the bin centers $T + m\sigma(\hat{x}_i,\ \hat{y}_j)$ are always at integer coordinates within the image boundaries. This eliminates the need for costly interpolation. This condition amounts to (expressed in terms of the $x$ coordinate, and equally applicable to $y$)
\[
\{0,\dots,W-1\} \ni T_x + m\sigma\,\hat{x}_i
  = T_x + m\sigma\left(i - \frac{N_x - 1}{2}\right)
  = \bar{T}_x + m\sigma\, i,
\qquad i = 0,\dots,N_x - 1.
\]

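Notice that for this condition to be satisfied, the descriptor center $T_x$ needs to be either fractional or integer depending on $N_x$ being even or odd. To eliminate this complication, it is simpler to use as a reference not the descriptor center $T$, but the coordinates of the upper-left bin $\bar{T}$. Thus we sample the latter on a regular (integer) grid
\[
\begin{bmatrix} 0 \\ 0 \end{bmatrix}
\le
\bar{T} =
\begin{bmatrix} \bar{T}^{\min}_x + p\,\Delta_x \\ \bar{T}^{\min}_y + q\,\Delta_y \end{bmatrix}
\le
\begin{bmatrix} W - 1 - m\sigma N_x \\ H - 1 - m\sigma N_y \end{bmatrix},
\qquad
\bar{T} =
\begin{bmatrix} T_x - m\sigma\,\dfrac{N_x - 1}{2} \\[2mm] T_y - m\sigma\,\dfrac{N_y - 1}{2} \end{bmatrix}
\]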

and we impose that the bin size $m\sigma$ is integer as well.
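The sampling rule can be sketched along one axis as follows. This is an illustration under stated assumptions, not the library's code: the function names are hypothetical, and the sketch applies the bin-center condition directly (the last bin center must not exceed W - 1), so the largest first-bin coordinate it admits is W - 1 - bin_size * (num_bins - 1), which is slightly more permissive than the bound in the display above.

```c
#include <assert.h>

/* Number of descriptors along one axis when the first-bin coordinate is
 * sampled as t_min + p * step and every bin center must lie in
 * {0, ..., size - 1}. All quantities are integers, in pixels. */
static int dsift_axis_count(int size, int num_bins, int bin_size,
                            int step, int t_min)
{
    assert(size > 0 && num_bins > 0 && bin_size > 0 && step > 0 && t_min >= 0);

    /* Largest first-bin coordinate that keeps the last bin inside the image. */
    int t_max = (size - 1) - bin_size * (num_bins - 1);
    if (t_max < t_min) return 0;
    return (t_max - t_min) / step + 1;
}

/* Descriptor center for grid index p. It is recovered from the first-bin
 * coordinate and may be fractional when bin_size * (num_bins - 1) is odd. */
static double dsift_axis_center(int num_bins, int bin_size,
                                int step, int t_min, int p)
{
    int t_bar = t_min + p * step;
    return t_bar + bin_size * (num_bins - 1) / 2.0;
}
```

For a 10-pixel-wide image with 4 bins of size 3, exactly one descriptor fits, with bins centered at 0, 3, 6 and 9 and a descriptor center at 4.5. This shows why the integer grid is placed on the first bin rather than on the center.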
Source: VLFeat documentation.
