MobileNetv1
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Paper: https://arxiv.org/pdf/1704.04861.pdf
1 Depthwise Separable Convolution
Notation: D_K = kernel size, D_F = feature map (spatial) size, M = number of input channels, N = number of output channels.
Standard convolution
parameters: D_K × D_K × M × N
computation cost: D_K × D_K × M × N × D_F × D_F
depthwise convolutions and pointwise convolutions
parameters: D_K × D_K × M + M × N
computation cost: D_K × D_K × M × D_F × D_F + M × N × D_F × D_F
Ratio of depthwise separable convolution cost to standard convolution cost:
(D_K × D_K × M × D_F × D_F + M × N × D_F × D_F) / (D_K × D_K × M × N × D_F × D_F) = 1/N + 1/D_K²
MobileNet uses 3 × 3 depthwise separable convolutions, which use 8 to 9 times less computation than standard convolutions at only a small reduction in accuracy (with D_K = 3, the ratio above is roughly 1/9).
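To make the two layouts concrete, here is a minimal sketch (assuming PyTorch; the module name and channel sizes are illustrative) of a depthwise separable convolution, with a parameter-count comparison against a standard 3 × 3 convolution:

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1):
        super().__init__()
        # Depthwise: one D_K x D_K filter per input channel (D_K * D_K * M weights)
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size, stride,
                                   padding=kernel_size // 2, groups=in_ch, bias=False)
        # Pointwise: 1x1 conv that mixes channels (M * N weights)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

std = nn.Conv2d(32, 64, 3, padding=1, bias=False)  # 3*3*32*64 = 18432 weights
dws = DepthwiseSeparableConv(32, 64)               # 3*3*32 + 32*64 = 2336 weights
print(sum(p.numel() for p in std.parameters()))    # 18432
print(sum(p.numel() for p in dws.parameters()))    # 2336, ~7.9x fewer
```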
2 Network Structure
As shown in the figure above, MobileNet spends 95% of its computation time in 1 × 1 convolutions, which also hold 75% of the parameters. Nearly all of the additional parameters are in the fully connected layer.
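As a rough sanity check of these percentages, the sketch below (assuming PyTorch; the block configuration follows Table 1 of the paper) builds the convolutional body of MobileNetV1 and measures the share of parameters held by 1 × 1 convolutions:

```python
import torch.nn as nn

def conv_bn(in_ch, out_ch, stride):
    # Standard 3x3 conv block (only the very first layer of the network)
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, stride, 1, bias=False),
                         nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

def conv_dw(in_ch, out_ch, stride):
    # Depthwise separable block: 3x3 dw conv + 1x1 pw conv, each with BN + ReLU
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, stride, 1, groups=in_ch, bias=False),
        nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
        nn.Conv2d(in_ch, out_ch, 1, bias=False),
        nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

# (out_channels, stride) for the 13 depthwise separable blocks in Table 1
cfg = [(64, 1), (128, 2), (128, 1), (256, 2), (256, 1), (512, 2),
       (512, 1), (512, 1), (512, 1), (512, 1), (512, 1), (1024, 2), (1024, 1)]

layers, in_ch = [conv_bn(3, 32, 2)], 32
for out_ch, s in cfg:
    layers.append(conv_dw(in_ch, out_ch, s))
    in_ch = out_ch
body = nn.Sequential(*layers)

total = sum(p.numel() for p in body.parameters())
pw = sum(m.weight.numel() for m in body.modules()
         if isinstance(m, nn.Conv2d) and m.kernel_size == (1, 1))
print(f"1x1 conv share of body parameters: {pw / total:.0%}")
```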
3 Width Multiplier: Thinner Models
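The width multiplier α ∈ (0, 1] thins the network uniformly at every layer: input channels M become αM and output channels N become αN, so the cost of a depthwise separable layer becomes:

computation cost: D_K × D_K × αM × D_F × D_F + αM × αN × D_F × D_F

Computation and parameter count shrink by roughly α²; the paper uses α = 1, 0.75, 0.5 and 0.25.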
4 Resolution Multiplier: Reduced Representation
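The resolution multiplier ρ ∈ (0, 1] is applied to the input image (and hence to every feature map), shrinking D_F to ρD_F:

computation cost: D_K × D_K × αM × ρD_F × ρD_F + αM × αN × ρD_F × ρD_F

This reduces computation by roughly ρ² but, unlike α, leaves the parameter count unchanged; the paper uses input resolutions 224, 192, 160 and 128.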
MobileNetv2
Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation
Paper: https://arxiv.org/pdf/1801.04381.pdf
1 Linear Bottlenecks
When the dimensionality is low (e.g. n = 2 or 3), ReLU destroys a large amount of information, whereas at higher dimensionality (e.g. n = 15 or 16) only a little is lost. The last ReLU6 in the block is therefore replaced with a linear transformation: the linear bottleneck.
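The effect can be reproduced numerically. The sketch below (assuming NumPy; the random embedding and least-squares read-out are illustrative, not the paper's exact setup) embeds 2-D points into n dimensions, applies ReLU, and measures how well the points can be linearly recovered:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 2))              # points on a 2-D "manifold of interest"

for n in [2, 3, 15, 30]:
    T = rng.standard_normal((2, n))             # random embedding into n dims
    y = np.maximum(x @ T, 0.0)                  # ReLU in the n-dim space
    A, *_ = np.linalg.lstsq(y, x, rcond=None)   # best linear projection back to 2-D
    err = np.mean((x - y @ A) ** 2)
    print(f"n={n:2d}  reconstruction MSE: {err:.3f}")
```

The reconstruction error drops sharply as n grows: in high dimensions the ReLU'd representation still carries nearly all of the input's information.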
2 Inverted Residuals
Comparing the v1 and v2 block structures: in v2, the first 1 × 1 convolution expands the channel dimension. A depthwise convolution cannot change the number of channels, so if the input channels are few, the depthwise convolution would operate in a very low-dimensional space; the block therefore expands the dimension first.
Bottleneck residual block with downsampling (stride 2)
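A minimal sketch (assuming PyTorch; expand_ratio = 6 as in the paper) of the inverted residual block: a 1 × 1 expansion with ReLU6, a 3 × 3 depthwise convolution with ReLU6, then a linear 1 × 1 projection, with a skip connection only when stride = 1 and input and output channels match:

```python
import torch.nn as nn

class InvertedResidual(nn.Module):
    def __init__(self, in_ch, out_ch, stride, expand_ratio=6):
        super().__init__()
        hidden = in_ch * expand_ratio
        self.use_res = (stride == 1 and in_ch == out_ch)
        self.block = nn.Sequential(
            # 1x1 expansion: widen before the depthwise conv
            nn.Conv2d(in_ch, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            # 3x3 depthwise conv (the stride-2 variant downsamples here)
            nn.Conv2d(hidden, hidden, 3, stride, 1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            # linear 1x1 projection back down: no ReLU (the linear bottleneck)
            nn.Conv2d(hidden, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch))

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_res else out
```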
3 Model Architecture
MobileNetv3
Searching for MobileNetV3
Paper: https://arxiv.org/pdf/1905.02244.pdf
1 SE Block
v3 adds an attention module on top of v2. Unlike SENet, the SE block is inserted after the depthwise convolution, so it operates on the expanded (and therefore more numerous) channels.
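A rough sketch (assuming PyTorch; reduction ratio 4 as in the paper) of the v3-style SE block: squeeze by global average pooling, two 1 × 1 layers, and a hard-sigmoid gate, applied to the expanded channels after the depthwise convolution:

```python
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # squeeze: global spatial average
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Hardsigmoid())                         # cheap gate in [0, 1]

    def forward(self, x):
        return x * self.fc(self.pool(x))              # excite: per-channel rescaling
```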
2 Network Search
I have not fully figured this part out yet; it will be updated later.
2.1 Platform-Aware NAS for Block-wise Search
2.2 NetAdapt for Layer-wise Search
3 Redesigning Expensive Layers: modifications to the last stage of V2
In MobileNetV2 there is a 1 × 1 convolution before the average pooling whose purpose is to raise the feature-map dimensionality, which helps the final prediction but costs computation, since it runs on the full 7 × 7 map. The authors therefore move it after the average pooling: the pooling first reduces the feature map from 7 × 7 to 1 × 1, and only then does the 1 × 1 convolution raise the dimension, cutting that layer's computation by a factor of 7 × 7 = 49. To reduce computation further, they also drop the 3 × 3 depthwise and 1 × 1 projection convolutions of the preceding spindle-shaped bottleneck block, yielding the structure shown in the second row of the figure, with no loss in accuracy. Altogether this saves about 7 ms of running time.
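The core of the change can be sketched as follows (assuming PyTorch; the 960 → 1280 channel sizes are taken from the v3-Large last stage): the final 1 × 1 expansion is moved behind the global average pooling so it runs on a 1 × 1 map instead of a 7 × 7 one:

```python
import torch
import torch.nn as nn

# V2-style: the final 1x1 expansion runs on the full 7x7 feature map
before = nn.Sequential(nn.Conv2d(960, 1280, 1), nn.Hardswish(),
                       nn.AdaptiveAvgPool2d(1))
# V3 efficient last stage: pool to 1x1 first, then expand (49x fewer multiply-adds)
after = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                      nn.Conv2d(960, 1280, 1), nn.Hardswish())

x = torch.randn(1, 960, 7, 7)
print(before(x).shape, after(x).shape)   # both torch.Size([1, 1280, 1, 1])
```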
4 h-swish
The authors of the Swish paper argue that Swish is unbounded above, bounded below, smooth, and non-monotonic.
swish(x) = x · σ(x)
Replacing the sigmoid with ReLU6 yields hard-swish:
h-swish(x) = x · ReLU6(x + 3) / 6
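A minimal implementation sketch (recent PyTorch versions also ship this as nn.Hardswish):

```python
import torch
import torch.nn.functional as F

def h_swish(x):
    # h-swish(x) = x * ReLU6(x + 3) / 6: a piecewise-linear approximation of swish
    return x * F.relu6(x + 3.0) / 6.0

x = torch.linspace(-6.0, 6.0, 7)
print(h_swish(x))   # matches torch.nn.Hardswish()(x)
```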