Abstract
MobileNet is a lightweight network architecture for mobile and embedded vision applications, from authors at Google. Its core contribution is to replace the standard multi-channel 2D convolution with a depthwise convolution followed by a 1x1 convolution, yielding light-weight deep neural networks. The paper reports extensive experiments on the resource/accuracy trade-off, and the model compares well against other well-known models on the ImageNet classification task.
Background
Approaches to making networks smaller:
(1) Kernel factorization: replace an NxN kernel with a 1xN kernel followed by an Nx1 kernel
(2) Bottleneck structures, with SqueezeNet as a representative example
(3) Storing weights in low-precision floats, e.g. Deep Compression
(4) Pruning redundant kernels plus Huffman coding
Standard 3D Image Convolution
Standard 3D image convolution means convolving a multi-channel image (say the image has M channels, M > 1) with N KxK kernels (the N kernels are all distinct); this is exactly what a convolution layer in a deep neural network computes. The procedure is:
(1) Take one kernel and convolve it with each of the image's channels as a 2D convolution; the M channels give M results;
(2) Sum these M results element-wise to obtain a single output map;
(3) Repeat steps (1)-(2) for each of the N kernels, giving an N-channel result: the feature map.
The computation is ((K x K x H x W) x M) x N, where H and W are the height and width of the output feature map.
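As a sanity check of these shapes, here is a minimal sketch assuming TensorFlow 1.x (the same API family as the code later in this post); the sizes M, N, K, H, W are made-up examples:

import tensorflow as tf

M, N, K, H, W = 3, 16, 3, 32, 32                      # example sizes, not from the paper
image = tf.placeholder(tf.float32, [1, H, W, M])
kernels = tf.get_variable("w", [K, K, M, N])          # N distinct KxK kernels
out = tf.nn.conv2d(image, kernels, strides=[1, 1, 1, 1], padding="SAME")
print(out.shape)  # (1, 32, 32, 16): each output channel sums over all M input channels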
Depthwise Separable Convolution
The depthwise convolution (depthwise conv), the first half of a depthwise separable convolution, is essentially step (1) of the standard 3D convolution above: one kernel is convolved with each channel separately, giving M results that are not summed across channels.
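A hedged illustration with tf.nn.depthwise_conv2d (TensorFlow 1.x; the shapes are made-up examples) makes those "M results" explicit:

import tensorflow as tf

M = 3                                            # input channels
x = tf.placeholder(tf.float32, [1, 32, 32, M])
w = tf.get_variable("w_dw", [3, 3, M, 1])        # one 3x3 kernel per channel (depth_multiplier=1)
y = tf.nn.depthwise_conv2d(x, w, strides=[1, 1, 1, 1], padding="SAME")
print(y.shape)  # (1, 32, 32, 3): M per-channel results, with no sum across channels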
1x1 Convolution
The 1x1 convolution first appeared in the Network in Network paper. It is mainly used for channel compression, i.e. changing the number of channels of a feature map, and was later also used in place of fully connected layers to reduce computation.
For an image with M channels, the computation of N 1x1 convolutions is: ((1 x 1 x H x W) x M) x N
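For example, a sketch of channel compression with a 1x1 convolution (TensorFlow 1.x, made-up sizes):

import tensorflow as tf

x = tf.placeholder(tf.float32, [1, 56, 56, 64])
w = tf.get_variable("w_1x1", [1, 1, 64, 16])     # 16 kernels of shape 1x1x64
y = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding="SAME")
print(y.shape)  # (1, 56, 56, 16): channels compressed from 64 to 16, spatial size unchanged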
The MobileNet Approach
A standard convolution can therefore be replaced by a depthwise convolution and a 1x1 convolution, with computation:
(K x K x H x W) x M + ((1 x 1 x H x W) x M) x N = H x W x (K x K + N) x M
Compared with the standard convolution, the ratio of the two computations is:
(H x W x M x (K x K + N)) / (H x W x K x K x M x N) = 1/N + 1/(K x K)
For K = 3 and a reasonably large N this is roughly 1/9, i.e. about 8-9x less computation, as the quick check below confirms.
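A quick numeric check of this ratio in plain Python; the sizes K, H, W, M, N are taken from the 56x56x128 block of the structure table below:

def conv_flops(K, H, W, M, N):
    # multiply-adds of a standard KxK convolution
    return K * K * H * W * M * N

def dw_sep_flops(K, H, W, M, N):
    # multiply-adds of a depthwise convolution plus a 1x1 convolution
    return K * K * H * W * M + H * W * M * N

K, H, W, M, N = 3, 56, 56, 128, 128
print(dw_sep_flops(K, H, W, M, N) / conv_flops(K, H, W, M, N))  # ~0.119
print(1.0 / N + 1.0 / (K * K))                                  # same value, ~0.119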
Network Structure
The MobileNet architecture is very simple. The first layer is a standard 3x3 convolution with stride=2; after that, the basic unit is a depthwise convolution followed by a 1x1 convolution (conv_dw + conv1x1), and networks of different depths are formed by chaining a number of these units. Both the conv_dw and the conv1x1 in each unit are followed by batch norm and a ReLU activation, and downsampling is done by setting stride=2 in the conv_dw. At the end, average pooling replaces the fully connected layers; for the ImageNet classification task, a single fully connected layer (units=1000) with softmax outputs the class and its confidence.
MobileNet structure
--------------------------------------------------
layer | kernel (kh x kw), out channels, stride | output size
--------------------------------------------------
input image (224 x 224 x 3)
--------------------------------------------------
conv | 3x3, 32, 2 | 112x112x32
--------------------------------------------------
conv_dw | 3x3, 32dw, 1 | 112x112x32
conv1x1 | 1x1, 64, 1 | 112x112x64
--------------------------------------------------
conv_dw | 3x3, 64dw, 2 | 56x56x64
conv1x1 | 1x1, 128, 1 | 56x56x128
--------------------------------------------------
conv_dw | 3x3, 128dw, 1 | 56x56x128
conv1x1 | 1x1, 128, 1 | 56x56x128
--------------------------------------------------
conv_dw | 3x3, 128dw, 2 | 28x28x128
conv1x1 | 1x1, 256, 1 | 28x28x256
--------------------------------------------------
conv_dw | 3x3, 256dw, 1 | 28x28x256
conv1x1 | 1x1, 256, 1 | 28x28x256
--------------------------------------------------
conv_dw | 3x3, 256dw, 2 | 14x14x256
conv1x1 | 1x1, 512, 1 | 14x14x512
--------------------------------------------------
5x
conv_dw | 3x3, 512dw, 1 | 14x14x512
conv1x1 | 1x1, 512, 1 | 14x14x512
--------------------------------------------------
conv_dw | 3x3, 512dw, 2 | 7x7x512
conv1x1 | 1x1, 1024, 1 | 7x7x1024
--------------------------------------------------
conv_dw | 3x3, 1024dw, 1 | 7x7x1024
conv1x1 | 1x1, 1024, 1 | 7x7x1024
--------------------------------------------------
Avg Pool | 7x7, 1 | 1x1x1024
FC | 1024, 1000 | 1x1x1000
Softmax | Classifier | 1x1x1000
--------------------------------------------------
Implementation
This post builds the MobileNet architecture with the tensorflow.contrib.layers module; for how to build networks with the tf.nn and tf.layers APIs, see the corresponding code in the VGG post.
# --------------------------Method 1 --------------------------------------------
import tensorflow as tf
import tensorflow.contrib.layers as tcl
from tensorflow.contrib.framework import arg_scope

class Mobilenet:
    def __init__(self, resolution_inp=224, channel=3, name='mobilenet'):
        self.name = name
        self.channel = channel
        self.resolution_inp = resolution_inp

    def _depthwise_separable_conv(self, x, num_outputs, kernel_size=3, stride=1, scope=None):
        """basic unit: 3x3 depthwise conv followed by a 1x1 conv"""
        with tf.variable_scope(scope, "dw_blk"):
            # num_outputs=None makes separable_conv2d skip its pointwise step,
            # so this layer performs the depthwise convolution only
            dw_conv = tcl.separable_conv2d(x, num_outputs=None,
                                           kernel_size=kernel_size,
                                           stride=stride,
                                           depth_multiplier=1)
            conv_1x1 = tcl.conv2d(dw_conv, num_outputs=num_outputs, kernel_size=1, stride=1)
            return conv_1x1

    def __call__(self, x, dropout=0.5, is_training=True):
        with tf.variable_scope(self.name) as scope:
            with arg_scope([tcl.batch_norm], is_training=is_training, scale=True):
                # every conv (depthwise and 1x1) is followed by batch norm + ReLU
                with arg_scope([tcl.conv2d, tcl.separable_conv2d],
                               activation_fn=tf.nn.relu,
                               normalizer_fn=tcl.batch_norm,
                               padding="SAME"):
                    conv1 = tcl.conv2d(x, 32, kernel_size=3, stride=2)          # 112x112x32
                    y = self._depthwise_separable_conv(conv1, 64, 3, stride=1)  # 112x112x64
                    y = self._depthwise_separable_conv(y, 128, 3, stride=2)     # 56x56x128
                    y = self._depthwise_separable_conv(y, 128, 3, stride=1)     # 56x56x128
                    y = self._depthwise_separable_conv(y, 256, 3, stride=2)     # 28x28x256
                    y = self._depthwise_separable_conv(y, 256, 3, stride=1)     # 28x28x256
                    y = self._depthwise_separable_conv(y, 512, 3, stride=2)     # 14x14x512
                    for _ in range(5):  # the 5x block in the table above
                        y = self._depthwise_separable_conv(y, 512, 3, stride=1)  # 14x14x512
                    y = self._depthwise_separable_conv(y, 1024, 3, stride=2)    # 7x7x1024
                    y = self._depthwise_separable_conv(y, 1024, 3, stride=1)    # 7x7x1024
                    avg_pool = tcl.avg_pool2d(y, 7, stride=1)                   # 1x1x1024
                    flatten = tf.layers.flatten(avg_pool)
                    self.fc6 = tf.layers.dense(flatten, units=1000, activation=tf.nn.relu)
                    # dropout = tf.nn.dropout(fc6, keep_prob=0.5)
                    predictions = tf.nn.softmax(self.fc6)
                    return predictions
Running
This part has two pieces: the timing function time_tensorflow_run takes a tf.Session, the tensor to evaluate, a feed dict, and an info string, and reports the time of 100 runs of that tensor (mean and standard deviation); the main function run_benchmark builds the Mobilenet model and times both inference (the forward pass) and gradient computation (the backward pass). The code follows:
# -------------------------- Demo and Test --------------------------------------------
import time
import math
from datetime import datetime

batch_size = 16
num_batches = 100

def time_tensorflow_run(session, target, feed, info_string):
    """
    calculate time for each session run
    :param session: tf.Session
    :param target: operator or tensor to run with the session
    :param feed: feed dict for the session
    :param info_string: info message for print
    :return:
    """
    num_steps_burn_in = 10          # warm-up iterations, excluded from the statistics
    total_duration = 0.0            # total elapsed time
    total_duration_squared = 0.0    # sum of squared durations, used for the variance
    for i in range(num_batches + num_steps_burn_in):
        start_time = time.time()
        _ = session.run(target, feed_dict=feed)
        duration = time.time() - start_time
        if i >= num_steps_burn_in:  # only count iterations after the warm-up
            if not i % 10:
                print('[%s] step %d, duration = %.3f' % (datetime.now(), i - num_steps_burn_in, duration))
            total_duration += duration
            total_duration_squared += duration * duration
    mn = total_duration / num_batches                      # mean time per batch
    vr = total_duration_squared / num_batches - mn * mn    # variance
    sd = math.sqrt(vr)                                     # standard deviation
    print('[%s] %s across %d steps, %.3f +/- %.3f sec/batch' % (datetime.now(), info_string, num_batches, mn, sd))
# test demo
def run_benchmark():
    """
    main function for test or demo
    :return:
    """
    with tf.Graph().as_default():
        image_size = 224  # input image size
        images = tf.Variable(tf.random_normal([batch_size, image_size, image_size, 3], dtype=tf.float32, stddev=1e-1))
        model = Mobilenet(224, 3)
        prediction = model(images, is_training=True)
        fc = model.fc6
        params = tf.trainable_variables()
        for v in params:
            print(v)
        init = tf.global_variables_initializer()
        print("out shape ", prediction)
        sess = tf.Session()
        print("init...")
        sess.run(init)
        print("predict..")
        writer = tf.summary.FileWriter("./logs")
        writer.add_graph(sess.graph)
        time_tensorflow_run(sess, prediction, {}, "Forward")
        # simulate the training process
        objective = tf.nn.l2_loss(fc)           # a stand-in loss
        grad = tf.gradients(objective, params)  # gradients of the loss w.r.t. all model parameters
        print('grad backward')
        time_tensorflow_run(sess, grad, {}, "Forward-backward")
        writer.close()

if __name__ == '__main__':
    run_benchmark()
Note: the full code is available in my personal GitHub project.
Parameter Count
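The MobileNet paper reports roughly 4.2 million parameters for the full model. As a hedged sketch (it assumes the Mobilenet graph from run_benchmark above has been built in the current default graph), the trainable parameters can be totaled like this:

import numpy as np
import tensorflow as tf

def count_params():
    # sum of element counts over all trainable variables in the default graph
    return sum(int(np.prod(v.get_shape().as_list())) for v in tf.trainable_variables())

# e.g. inside run_benchmark, after the model is built:
# print("total trainable parameters: %.1fM" % (count_params() / 1e6))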
Timing
References
https://blog.csdn.net/wfei101/article/details/78310226
https://blog.csdn.net/u013709270/article/details/78722985
1x1 convolution