1. Preface
After AlexNet's breakthrough, AI researchers began pushing network architectures toward higher accuracy and greater depth. The paper "Very Deep Convolutional Networks for Large-Scale Image Recognition", from the University of Oxford's Visual Geometry Group and Google DeepMind, is an exploration of convolutional network depth, and the network it introduced was named VGG after the group.
VGG's structure is clean and very easy to adapt to other datasets; it can also be applied to small-image datasets such as MNIST and CIFAR.
VGG also performs well, and the model weights are openly available, which makes it well suited to transfer learning. Many variants have since been derived from it, forming the VGG family.
2. Introduction
Figure 1: The VGG family
Characteristics of the VGG family:
- The input size is 224×224.
- Each network has five pooling layers, all max-pooling with pool_size=2×2 and strides=2, so each pooling halves the feature map, which ends up at 7×7 (see the sketch after this list).
- Each network ends with two fully connected layers of 4096 nodes followed by a fully connected layer of 1000 nodes with a softmax activation.
- The convolutional layers and the 4096-node fully connected layers all use the ReLU activation.
- The networks use 3×3 convolution kernels (and, in some configurations, 1×1 kernels), which allows greater depth and makes the decision function more non-linear.
- More neurons and more layers mean slower training.
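As a quick sanity check of the figures above (an illustrative sketch only, not code from the paper), the feature-map size after five stride-2 poolings and the parameter savings of stacking 3×3 kernels instead of a single larger kernel can be verified in a few lines:
# 224 halved by each of the five 2x2/stride-2 max-pooling layers: 224 / 2**5 = 7
size = 224
for _ in range(5):
    size //= 2
print(size)  # -> 7

# Weights of two stacked 3x3 convs vs. one 5x5 conv (same 5x5 receptive field),
# assuming C input channels and C output channels, biases ignored:
C = 512
stacked_3x3 = 2 * 3 * 3 * C * C   # 4,718,592
single_5x5 = 5 * 5 * C * C        # 6,553,600
print(stacked_3x3, single_5x5)
Stacking two 3×3 layers also inserts an extra ReLU between them, which is exactly the "more non-linear decision function" point above.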
Figure 2: Performance of the VGG family
As Figure 2 shows, the error rate decreases as the number of layers grows, but once the depth reaches 16 layers, i.e. VGG-C and VGG-D, the accuracy starts to saturate. The 19-layer VGG-E lowers the error rate a little further, but its training time is very long. (Residual networks use a shortcut mechanism that, in theory, lets the depth grow without bound.)
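For comparison, the shortcut mechanism mentioned in passing above can be written in the same Keras functional style. The block below is only a minimal sketch of an identity shortcut, not part of the VGG models in this article, and it assumes the input already has `filters` channels so the addition is valid:
from keras.layers import Conv2D, Add, Activation

def residual_block(x, filters=64):
    # Two 3x3 convolutions, then add the input back in (the shortcut connection)
    y = Conv2D(filters, (3, 3), padding='same', activation='relu')(x)
    y = Conv2D(filters, (3, 3), padding='same')(y)
    y = Add()([x, y])  # assumes x already has `filters` channels
    return Activation('relu')(y)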
3. Code
Keras implementation of the models:
Only the model definitions of VGG-A, VGG-B and VGG-E are given here; the other configurations can be implemented in the same way.
from keras.layers import Flatten, Conv2D, MaxPool2D, Dense, Dropout, Input, Activation
from keras.models import Model
# VGG-A
def vgg_a(x):
    x1 = Conv2D(64, (3, 3), padding='same', activation='relu')(x)
    x1 = MaxPool2D(pool_size=(2, 2), strides=2)(x1)
    x1 = Conv2D(128, (3, 3), padding='same', activation='relu')(x1)
    x1 = MaxPool2D(pool_size=(2, 2), strides=2)(x1)
    x1 = Conv2D(256, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(256, (3, 3), padding='same', activation='relu')(x1)
    x1 = MaxPool2D(pool_size=(2, 2), strides=2)(x1)
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)
    x1 = MaxPool2D(pool_size=(2, 2), strides=2)(x1)
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)
    x1 = MaxPool2D(pool_size=(2, 2), strides=2)(x1)
    x1 = Flatten()(x1)  # flatten the 7x7x512 feature map before the fully connected layers
    x1 = Dense(4096, activation='relu')(x1)
    x1 = Dropout(0.5)(x1)
    x1 = Dense(4096, activation='relu')(x1)
    x1 = Dropout(0.5)(x1)
    x1 = Dense(10)(x1)  # 10 output classes here; the original VGG uses 1000 for ImageNet
    output = Activation('softmax')(x1)
    return output
input_img = Input(shape=(224, 224, 1))  # single-channel (grayscale) input; use (224, 224, 3) for RGB images
VGG_A = Model(input_img, vgg_a(input_img))
VGG_A.summary()
# VGG-B
def vgg_b(x):
    x1 = Conv2D(64, (3, 3), padding='same', activation='relu')(x)
    x1 = Conv2D(64, (3, 3), padding='same', activation='relu')(x1)
    x1 = MaxPool2D(pool_size=(2, 2), strides=2)(x1)
    x1 = Conv2D(128, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(128, (3, 3), padding='same', activation='relu')(x1)
    x1 = MaxPool2D(pool_size=(2, 2), strides=2)(x1)
    x1 = Conv2D(256, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(256, (3, 3), padding='same', activation='relu')(x1)
    x1 = MaxPool2D(pool_size=(2, 2), strides=2)(x1)
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)
    x1 = MaxPool2D(pool_size=(2, 2), strides=2)(x1)
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)
    x1 = MaxPool2D(pool_size=(2, 2), strides=2)(x1)
    x1 = Flatten()(x1)  # flatten the 7x7x512 feature map before the fully connected layers
    x1 = Dense(4096, activation='relu')(x1)
    x1 = Dropout(0.5)(x1)
    x1 = Dense(4096, activation='relu')(x1)
    x1 = Dropout(0.5)(x1)
    x1 = Dense(10)(x1)  # 10 output classes here; the original VGG uses 1000 for ImageNet
    output = Activation('softmax')(x1)
    return output
VGG_B = Model(input_img, vgg_b(input_img))
VGG_B.summary()
# VGG-E
def vgg_e(x):
    x1 = Conv2D(64, (3, 3), padding='same', activation='relu')(x)
    x1 = Conv2D(64, (3, 3), padding='same', activation='relu')(x1)
    x1 = MaxPool2D(pool_size=(2, 2), strides=2)(x1)
    x1 = Conv2D(128, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(128, (3, 3), padding='same', activation='relu')(x1)
    x1 = MaxPool2D(pool_size=(2, 2), strides=2)(x1)
    x1 = Conv2D(256, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(256, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(256, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(256, (3, 3), padding='same', activation='relu')(x1)
    x1 = MaxPool2D(pool_size=(2, 2), strides=2)(x1)
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)  # a 3x3 conv here improves results more than the 1x1 conv used in VGG-C
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)
    x1 = MaxPool2D(pool_size=(2, 2), strides=2)(x1)
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)
    x1 = Conv2D(512, (3, 3), padding='same', activation='relu')(x1)
    x1 = MaxPool2D(pool_size=(2, 2), strides=2)(x1)
    x1 = Flatten()(x1)  # flatten the 7x7x512 feature map before the fully connected layers
    x1 = Dense(4096, activation='relu')(x1)
    x1 = Dropout(0.5)(x1)
    x1 = Dense(4096, activation='relu')(x1)
    x1 = Dropout(0.5)(x1)
    x1 = Dense(10)(x1)  # 10 output classes here; the original VGG uses 1000 for ImageNet
    output = Activation('softmax')(x1)
    return output
VGG_E = Model(input_img, vgg_e(input_img))
VGG_E.summary()
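To actually train one of these models, the graph still has to be compiled with a loss and an optimizer. Below is a minimal sketch: `x_train` and `y_train` are hypothetical arrays of 224×224×1 images and one-hot labels for the 10 classes used above, the learning rate is only illustrative, and the optimizer argument names may differ slightly between Keras versions.
from keras.optimizers import SGD

# SGD with momentum, in the spirit of the original VGG training setup
VGG_A.compile(optimizer=SGD(lr=0.01, momentum=0.9),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# x_train: (num_samples, 224, 224, 1), y_train: (num_samples, 10) one-hot labels
VGG_A.fit(x_train, y_train, batch_size=32, epochs=10, validation_split=0.1)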