Faster-rcnn Source Code Analysis 6

Network structure of Fast R-CNN: stage1_fast_rcnn_train.pt

First, let's look at the data preparation stage:

layer {
  type: 'Python'
  top: 'data'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 21"
  }
}

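Open the roi_data_layer.layer module and look at the forward function:

We already used this function once when training the RPN (train_rpn). This time the configuration is different, so some of the details change:

First, as before, it fetches the blobs: blobs = self._get_next_minibatch()

Now step into _get_next_minibatch:

It first randomly picks two images from the image database: db_inds = self._get_next_minibatch_inds(). Inside _get_next_minibatch_inds, because cfg.TRAIN.IMS_PER_BATCH=2, two images are drawn each time, so every iteration processes two images (unlike train_rpn, which draws a single image).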
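The index sampling described above can be sketched roughly as follows. This is a simplified stand-in, not the actual RoIDataLayer code: the class name MinibatchSampler, the seed argument, and the explicit ims_per_batch parameter (playing the role of cfg.TRAIN.IMS_PER_BATCH) are assumptions made for illustration.

```python
import numpy as np


class MinibatchSampler(object):
    """Hypothetical stand-in for the index bookkeeping of the data layer."""

    def __init__(self, num_images, ims_per_batch=2, seed=0):
        self._num_images = num_images
        self._ims_per_batch = ims_per_batch
        self._rng = np.random.RandomState(seed)
        self._shuffle()

    def _shuffle(self):
        # Draw a fresh random ordering of all image indices.
        self._perm = self._rng.permutation(self._num_images)
        self._cur = 0

    def next_inds(self):
        # Reshuffle once the current permutation is (almost) used up.
        if self._cur + self._ims_per_batch >= self._num_images:
            self._shuffle()
        inds = self._perm[self._cur:self._cur + self._ims_per_batch]
        self._cur += self._ims_per_batch
        return inds


sampler = MinibatchSampler(num_images=10, ims_per_batch=2)
db_inds = sampler.next_inds()  # two distinct image indices
```

With ims_per_batch=1 the same sketch would behave like the single-image sampling used during RPN training.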

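Back in _get_next_minibatch, the next step collects the roidb entries for those images: minibatch_db = [self._roidb[i] for i in db_inds]. This is a list, which is finally handed to get_minibatch to produce the result: return get_minibatch(minibatch_db, self._num_classes)

Step into get_minibatch:

First, the images are rescaled and the scale factors im_scales are recorded, and the images in minibatch_db are converted into the blob format that Caffe expects:

im_blob, im_scales = _get_image_blob(roidb, random_scale_inds), where im_blob has a batch size of 2.

Next, im_blob is added to the blobs dictionary: blobs = {'data': im_blob}

Because cfg.TRAIN.HAS_RPN=False, the else branch runs here. It starts by initializing some empty arrays that will later be added to the blobs dictionary: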

rois_blob = np.zeros((0, 5), dtype=np.float32)
labels_blob = np.zeros((0), dtype=np.float32)
bbox_targets_blob = np.zeros((0, 4 * num_classes), dtype=np.float32)
bbox_inside_blob = np.zeros(bbox_targets_blob.shape, dtype=np.float32)

Then it loops over the image list (only two images, in fact) and computes labels, overlaps, im_rois, bbox_targets, and bbox_inside_weights for each image:

for im_i in range(num_images):
    labels, overlaps, im_rois, bbox_targets, bbox_inside_weights \
        = _sample_rois(roidb[im_i], fg_rois_per_image, rois_per_image, num_classes)

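Now step into _sample_rois:

First, the inputs:

roidb[im_i]: the roidb entry of image im_i

rois_per_image: 64 by default

fg_rois_per_image: 16 by default

num_classes: 21

As for what _sample_rois does: it cuts the proposals we obtained (<= 2000) down to 64, of which at most 16 are foreground proposals (selected with cfg.TRAIN.FG_THRESH; if there are more, they are sampled randomly) and between 48 and 64 are background proposals (again sampled randomly if there are more). Once the proposals have been limited to 64, the ones that remain are referred to as rois.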
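A rough sketch of this sampling step is shown below. It is not the original _sample_rois: the function name sample_rois_sketch is made up, and the threshold defaults (fg_thresh=0.5, bg_thresh_hi=0.5, bg_thresh_lo=0.1) are assumed values standing in for the cfg.TRAIN settings, which may differ in a given configuration.

```python
import numpy as np


def sample_rois_sketch(max_overlaps, max_classes,
                       rois_per_image=64, fg_rois_per_image=16,
                       fg_thresh=0.5, bg_thresh_hi=0.5, bg_thresh_lo=0.1,
                       rng=None):
    """Pick at most 16 foreground and fill up to 64 with background rois."""
    if rng is None:
        rng = np.random.RandomState(0)

    # Foreground: overlap with a ground-truth box is high enough.
    fg_inds = np.where(max_overlaps >= fg_thresh)[0]
    fg_this = min(fg_rois_per_image, fg_inds.size)
    if fg_inds.size > 0:
        fg_inds = rng.choice(fg_inds, size=fg_this, replace=False)

    # Background: overlap falls inside [bg_thresh_lo, bg_thresh_hi).
    bg_inds = np.where((max_overlaps < bg_thresh_hi) &
                       (max_overlaps >= bg_thresh_lo))[0]
    bg_this = min(rois_per_image - fg_this, bg_inds.size)
    if bg_inds.size > 0:
        bg_inds = rng.choice(bg_inds, size=bg_this, replace=False)

    keep_inds = np.append(fg_inds, bg_inds)
    labels = max_classes[keep_inds].copy()
    labels[fg_this:] = 0  # everything after the foreground part is background
    return keep_inds, labels, fg_this


# Toy data: 30 foreground-like proposals and 200 background-like ones.
overlaps = np.concatenate([np.full(30, 0.8), np.full(200, 0.3)])
classes = np.concatenate([np.full(30, 7), np.full(200, 3)])
keep_inds, labels, num_fg = sample_rois_sketch(overlaps, classes)
```

In this toy case 16 foreground and 48 background rois are kept. With fewer than 16 foreground candidates, the background share grows toward 64, which is where the "between 48 and 64" range comes from.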

Now let's look at what the function returns. It starts from:

labels = roidb['max_classes']
overlaps = roidb['max_overlaps']
rois = roidb['boxes']

If the indices of the rois that are kept are keep_inds, the returned values are:

labels = labels[keep_inds]  # foreground labels: the corresponding object class index
labels[fg_rois_per_this_image:] = 0  # background labels are set to 0
overlaps = overlaps[keep_inds]
rois = rois[keep_inds]

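Finally, there are two more return values:

bbox_targets, bbox_inside_weights = _get_bbox_regression_labels(
    roidb['bbox_targets'][keep_inds, :], num_classes)

Both are the return values of _get_bbox_regression_labels.

Let's look at _get_bbox_regression_labels:

Its input is the bbox_targets field of the roidb entry, roidb['bbox_targets'][keep_inds, :]. Like rois, only the rows selected by keep_inds are kept. The other input is num_classes=21.

As for what _get_bbox_regression_labels does: it converts the matrix roidb['bbox_targets'][keep_inds, :] from len(keep_inds) rows by 5 columns into len(keep_inds) rows by 84 columns. In each row of the returned bbox_targets matrix, only the 4 columns belonging to that roi's object class are non-zero (they hold the last 4 columns of the original roidb['bbox_targets'][keep_inds, :] matrix) and the other 80 columns are 0. If a row corresponds to background, the entire row is 0.

_get_bbox_regression_labels also returns bbox_inside_weights, another len(keep_inds)-by-84 matrix that lines up with bbox_targets: in the 4 columns where bbox_targets holds a foreground roi's targets it takes the value (1.0, 1.0, 1.0, 1.0) by default, and it is 0 everywhere else.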
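The expansion from 5 columns to 4 * num_classes columns can be sketched like this. The function name expand_bbox_targets and the inside_weights argument are illustrative choices, not the names used in the original code. The sketch assumes column 0 of the input holds the class index and columns 1 to 4 hold the regression targets.

```python
import numpy as np


def expand_bbox_targets(bbox_target_data, num_classes,
                        inside_weights=(1.0, 1.0, 1.0, 1.0)):
    """Turn an N x 5 array (class, tx, ty, tw, th) into two N x 4K arrays."""
    clss = bbox_target_data[:, 0].astype(int)
    bbox_targets = np.zeros((clss.size, 4 * num_classes), dtype=np.float32)
    bbox_inside_weights = np.zeros_like(bbox_targets)
    # Only foreground rows (class > 0) get non-zero entries.
    for ind in np.where(clss > 0)[0]:
        start = 4 * clss[ind]
        bbox_targets[ind, start:start + 4] = bbox_target_data[ind, 1:]
        bbox_inside_weights[ind, start:start + 4] = inside_weights
    return bbox_targets, bbox_inside_weights


# One foreground roi of class 3 and one background roi.
data = np.array([[3, 0.1, 0.2, 0.3, 0.4],
                 [0, 0.0, 0.0, 0.0, 0.0]], dtype=np.float32)
targets, weights = expand_bbox_targets(data, num_classes=21)
```

Here targets has shape (2, 84). The first row is non-zero only in columns 12 to 15, and the background row is all zeros.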

Back in get_minibatch, _sample_rois has given us a series of return values: labels, overlaps, im_rois, bbox_targets, and bbox_inside_weights. The rest of the loop body processes them:

rois = _project_im_rois(im_rois, im_scales[im_i]): maps the coordinates in im_rois onto the rescaled image, because all processing happens on the rescaled image while the im_rois coordinates are relative to the original image.

Then a column is prepended to rois:

batch_ind = im_i * np.ones((rois.shape[0], 1))
rois_blob_this_image = np.hstack((batch_ind, rois))

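This shows that rois_blob_this_image is a matrix with rois.shape[0] rows and 5 columns. A 0 in the first column means the roi belongs to the first image of the batch, and a 1 means it belongs to the second image. (Note that the np.hstack call is safe even if the two images end up with different numbers of rois: batch_ind is built with rois.shape[0] rows, so it always matches the rois of the same image, and the np.vstack calls that follow only need the column counts to agree. In practice the row count is almost always 64 per image anyway, because there are plenty of proposals, up to 2000, while rois_per_image is only 64.)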

Then rois_blob_this_image is appended to rois_blob:

rois_blob = np.vstack((rois_blob, rois_blob_this_image)). After the two loop iterations, rois_blob contains the data of both images.
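The steps from projecting the rois to stacking them can be put together in a small sketch. The helper name build_rois_blob is hypothetical, and the projection is assumed to be a plain multiplication by the scale factor. The toy inputs contain 2 rois for the first image and 1 for the second.

```python
import numpy as np


def build_rois_blob(im_rois_list, im_scales):
    """Stack per-image rois into one (N, 5) array: [batch_index, x1, y1, x2, y2]."""
    rois_blob = np.zeros((0, 5), dtype=np.float32)
    for im_i, (im_rois, scale) in enumerate(zip(im_rois_list, im_scales)):
        # Map coordinates from the original image onto the rescaled image.
        rois = im_rois * scale
        # Prepend the index of the image inside the batch as column 0.
        batch_ind = im_i * np.ones((rois.shape[0], 1))
        rois_blob_this_image = np.hstack((batch_ind, rois))
        rois_blob = np.vstack((rois_blob, rois_blob_this_image))
    return rois_blob.astype(np.float32)


rois_a = np.array([[0.0, 0.0, 10.0, 10.0], [5.0, 5.0, 20.0, 20.0]])
rois_b = np.array([[1.0, 1.0, 2.0, 2.0]])
blob = build_rois_blob([rois_a, rois_b], im_scales=[2.0, 0.5])
```

The resulting blob has shape (3, 5), with first-column values 0, 0, 1 marking which image each roi came from.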

In the same way, labels_blob, bbox_targets_blob, and bbox_inside_blob are accumulated, so that the data of the two images ends up together:

labels_blob = np.hstack((labels_blob, labels))
bbox_targets_blob = np.vstack((bbox_targets_blob, bbox_targets))
bbox_inside_blob = np.vstack((bbox_inside_blob, bbox_inside_weights))

Finally, the data above is added to the blobs dictionary:

blobs['rois'] = rois_blob
blobs['labels'] = labels_blob
blobs['bbox_targets'] = bbox_targets_blob
blobs['bbox_inside_weights'] = bbox_inside_blob
# A matrix of 0s and 1s: 1 where bbox_inside_blob is non-zero (strictly, > 0), 0 elsewhere
blobs['bbox_outside_weights'] = np.array(bbox_inside_blob > 0).astype(np.float32)

Finally, get_minibatch returns the blobs dictionary.

Back in _get_next_minibatch and then in forward, we now have the blobs dictionary: blobs = self._get_next_minibatch().

Next, the values in the blobs dictionary are copied into top, which is the output of the forward function.

OK, that completes the data preparation. The data obtained above is then fed forward through the rest of the network.

The network starts with 5 convolutional layers, which we have seen several times already:

#========= conv1-conv5 ============

layer {
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  convolution_param {
    num_output: 96
    kernel_size: 7
    pad: 3
    stride: 2
  }
}
layer {
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 3
    alpha: 0.00005
    beta: 0.75
    norm_region: WITHIN_CHANNEL
    engine: CAFFE
  }
}
layer {
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param {
    kernel_size: 3
    stride: 2
    pad: 1
    pool: MAX
  }
}
layer {
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  convolution_param {
    num_output: 256
    kernel_size: 5
    pad: 2
    stride: 2
  }
}
layer {
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  type: "LRN"
  bottom: "conv2"
  top: "norm2"
  lrn_param {
    local_size: 3
    alpha: 0.00005
    beta: 0.75
    norm_region: WITHIN_CHANNEL
    engine: CAFFE
  }
}
layer {
  type: "Pooling"
  bottom: "norm2"
  top: "pool2"
  pooling_param {
    kernel_size: 3
    stride: 2
    pad: 1
    pool: MAX
  }
}
layer {
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  convolution_param {
    num_output: 384
    kernel_size: 3
    pad: 1
    stride: 1
  }
}
layer {
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  convolution_param {
    num_output: 384
    kernel_size: 3
    pad: 1
    stride: 1
  }
}
layer {
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    stride: 1
  }
}
layer {
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
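To get a feel for how much the conv1-conv5 stack shrinks the input, the spatial size can be traced layer by layer. The sketch below assumes Caffe's usual output-size rules (rounding down for convolution, rounding up for pooling, plus the check that drops a pooling window starting in the padding). The function names are made up for this example, and ReLU and LRN are skipped because they do not change the spatial size.

```python
import math


def conv_out(size, kernel, pad, stride):
    # Convolution rounds down: floor((size + 2*pad - kernel) / stride) + 1
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, pad, stride):
    # Pooling rounds up, then drops a window that would start inside the padding.
    out = int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1
    if pad > 0 and (out - 1) * stride >= size + pad:
        out -= 1
    return out


def conv5_size(size):
    """Spatial size of conv5 for a square input of the given side length."""
    size = conv_out(size, kernel=7, pad=3, stride=2)   # conv1
    size = pool_out(size, kernel=3, pad=1, stride=2)   # pool1
    size = conv_out(size, kernel=5, pad=2, stride=2)   # conv2
    size = pool_out(size, kernel=3, pad=1, stride=2)   # pool2
    for _ in range(3):                                 # conv3, conv4, conv5
        size = conv_out(size, kernel=3, pad=1, stride=1)
    return size


print(conv5_size(600))
```

For a side length of 600 this gives 39, i.e. the four stride-2 stages reduce the resolution by roughly a factor of 16.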

Next comes the ROI pooling layer:

The code for the ROI layer lives in the cpp files of Fast R-CNN.

After the ROI layer come two fully connected layers:

layer {
  type: "InnerProduct"
  bottom: "roi_pool_conv5"
  top: "fc6"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  inner_product_param {
    num_output: 4096
  }
}
layer {
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
    scale_train: false
  }
}
layer {
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  inner_product_param {
    num_output: 4096
  }
}
layer {
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
    scale_train: false
  }
}

Right after fc7 the network splits into two branches: one predicts the object class and the other predicts the box coordinates.

layer {
  type: "InnerProduct"
  bottom: "fc7"
  top: "cls_score"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  inner_product_param {
    num_output: 21
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}

This is a fully connected layer that predicts the class. Its output is cls_score.

layer {
  type: "InnerProduct"
  bottom: "fc7"
  top: "bbox_pred"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  inner_product_param {
    num_output: 84
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}

Likewise, a fully connected layer predicts the box coordinates. Its output is bbox_pred.

Finally, the loss is computed from cls_score and bbox_pred:

layer {
  type: "SoftmaxWithLoss"
  bottom: "cls_score"
  bottom: "labels"
  propagate_down: 1
  propagate_down: 0
  top: "cls_loss"
  loss_weight: 1
  loss_param {
    ignore_label: -1
    normalize: true
  }
}
layer {
  type: "SmoothL1Loss"
  bottom: "bbox_pred"
  bottom: "bbox_targets"
  bottom: "bbox_inside_weights"
  bottom: "bbox_outside_weights"
  top: "bbox_loss"
  loss_weight: 1
}
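To make the role of the four bottoms of the SmoothL1Loss layer more concrete, here is a NumPy sketch of a smooth L1 loss with inside and outside weights. It is an approximation written for illustration, not the layer's C++ implementation: the sigma parameter (set to 1 here) and the normalization by the number of rois are assumptions.

```python
import numpy as np


def smooth_l1_loss(bbox_pred, bbox_targets, inside_w, outside_w, sigma=1.0):
    """Weighted smooth L1 loss, averaged over the rows (rois)."""
    sigma2 = sigma ** 2
    # Inside weights mask the difference, so only the 4 columns of the
    # ground-truth class contribute for a foreground roi.
    d = inside_w * (bbox_pred - bbox_targets)
    abs_d = np.abs(d)
    per_elem = np.where(abs_d < 1.0 / sigma2,
                        0.5 * sigma2 * d ** 2,     # quadratic near zero
                        abs_d - 0.5 / sigma2)      # linear for large errors
    # Outside weights scale each element's contribution to the total.
    return float(np.sum(outside_w * per_elem) / bbox_pred.shape[0])


pred = np.array([[0.5, 2.0, 0.0, 0.0]])
target = np.zeros((1, 4))
ones = np.ones((1, 4))
loss = smooth_l1_loss(pred, target, ones, ones)
```

In this toy case the two non-zero differences contribute 0.125 (quadratic branch) and 1.5 (linear branch), giving a loss of 1.625. With all inside weights set to 0, as for a background roi, the loss is 0.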

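Finally, back in the train_fast_rcnn function:

Training has produced the Fast R-CNN network snapshots, whose paths are stored in the model_paths list. All the snapshot files in model_paths except the most recent one are then deleted:

for i in model_paths[:-1]:
    os.remove(i)

The one remaining element of the list is saved in fast_rcnn_model_path: fast_rcnn_model_path = model_paths[-1]

fast_rcnn_model_path is then pushed onto the child process's queue as a dictionary:

queue.put({'model_path': fast_rcnn_model_path})

That covers everything set up for the child process: p = mp.Process(target=train_fast_rcnn, kwargs=mp_kwargs)

Next, the process is started: p.start()

The trained Fast R-CNN network is read from the child process's queue: fast_rcnn_stage1_out = mp_queue.get()

Wait for the child process to finish: p.join()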
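The process-and-queue pattern used here can be reduced to a small sketch. The function train_stub is a hypothetical stand-in for train_fast_rcnn that only pretends to train, and the tag argument and the file name it produces are invented for the example.

```python
import multiprocessing as mp


def train_stub(queue=None, tag="stage1"):
    # Hypothetical stand-in for train_fast_rcnn: pretend training wrote a file
    # and report its path back to the parent through the queue.
    model_path = "{}_final.caffemodel".format(tag)
    queue.put({'model_path': model_path})


def run_stage(tag="stage1"):
    mp_queue = mp.Queue()
    mp_kwargs = dict(queue=mp_queue, tag=tag)
    p = mp.Process(target=train_stub, kwargs=mp_kwargs)
    p.start()
    stage_out = mp_queue.get()  # blocks until the child has put its result
    p.join()                    # wait for the child process to exit
    return stage_out
```

When run as a script, run_stage() should be called under an `if __name__ == "__main__":` guard so that the child process can import the module safely. Running each training stage in its own process is a design choice that lets all resources held by the child be released when it exits.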

Last edited: 2018.05.04 19:43:04
