Deep Learning (3): Convolutional Neural Networks (Part 2)

1. Typical Convolutional Neural Networks

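1.1 LeNet-5

LeNet-5 comes from the paper Gradient-Based Learning Applied to Document Recognition. Proposed by Yann LeCun in 1998, it is a highly efficient convolutional neural network for handwritten character recognition and reaches a recognition accuracy of 99.2% on the MNIST dataset. The network structure of LeNet-5 is shown in the figure below:

LeNet-5 architecture diagram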
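As a rough illustration of the layer layout, the sketch below writes a simplified LeNet-5 in PyTorch. It is not the exact 1998 design: the original used trainable subsampling layers, a sparse connection table in the third layer, and an RBF output layer, whereas this version assumes plain average pooling, full connections, and a linear output. The class name `LeNet5` and the 32 × 32 input size are choices made for this example.

```python
import torch
import torch.nn as nn


class LeNet5(nn.Module):
    """Simplified LeNet-5 for 1x32x32 inputs (an illustrative sketch)."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),    # C1: 1x32x32 -> 6x28x28
            nn.Tanh(),
            nn.AvgPool2d(2),                   # S2: 6x28x28 -> 6x14x14
            nn.Conv2d(6, 16, kernel_size=5),   # C3: 6x14x14 -> 16x10x10
            nn.Tanh(),
            nn.AvgPool2d(2),                   # S4: 16x10x10 -> 16x5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120),        # C5
            nn.Tanh(),
            nn.Linear(120, 84),                # F6
            nn.Tanh(),
            nn.Linear(84, num_classes),        # output layer (linear here, an assumption)
        )

    def forward(self, x):
        return self.classifier(self.features(x))


lenet = LeNet5()
lenet_out = lenet(torch.randn(2, 1, 32, 32))   # a batch of two dummy images
print(lenet_out.shape)
```

MNIST digits are 28 × 28, so they would need to be padded to 32 × 32 before being fed to this sketch.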

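1.2 AlexNet

AlexNet is regarded as the first modern deep convolutional network. It was the first to use many techniques that are now standard in deep convolutional networks, such as parallel training on GPUs, ReLU as the nonlinear activation function, Dropout to prevent overfitting, and data augmentation to improve accuracy. AlexNet won the 2012 ImageNet image classification competition. Its structure, shown in the figure below, consists of 5 convolutional layers and 3 fully connected layers, the last of which feeds a softmax output.

AlexNet architecture diagram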
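Two of the techniques listed above, ReLU and Dropout, are easy to observe in isolation. The snippet below is a small sketch using PyTorch's `nn.ReLU` and `nn.Dropout`; the tensor sizes and the drop probability of 0.5 are arbitrary choices for the demonstration, not AlexNet's actual configuration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# ReLU zeroes negative inputs and passes positive inputs through unchanged
relu_out = nn.ReLU()(torch.tensor([-1.0, 0.0, 2.0]))

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()          # training mode: elements are randomly zeroed
y_train = drop(x)     # surviving elements are scaled by 1 / (1 - p) = 2

drop.eval()           # evaluation mode: Dropout becomes the identity
y_eval = drop(x)

print(relu_out)
print(y_train.unique(), y_eval.unique())
```

Because Dropout behaves differently in the two modes, a model that contains it should be switched with `model.train()` and `model.eval()` at the appropriate times.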

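1.3 VGGNet

VGGNet is a deep convolutional neural network developed jointly by researchers from the Visual Geometry Group at the University of Oxford and Google DeepMind. VGGNet explores the relationship between the depth of a convolutional network and its performance: by repeatedly stacking small 3×3 convolution kernels and 2×2 max-pooling layers, and using only these sizes throughout, it successfully builds networks 16 to 19 layers deep. Compared with the previous state-of-the-art architectures, VGGNet lowers the error rate substantially, with the improvement coming from steadily deepening the network.

VGG-16 architecture diagram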
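One commonly cited reason for preferring small kernels, stated here as background rather than something taken from the text above, is that two stacked 3×3 convolutions cover the same 5×5 receptive field as a single 5×5 convolution while using fewer weights. The sketch below checks the parameter counts; the channel count of 64 and the helper `count_params` are assumptions for this example, and biases are disabled to keep the arithmetic simple.

```python
import torch
import torch.nn as nn


def count_params(module):
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())


channels = 64  # illustrative channel count

# Two stacked 3x3 convolutions (padding keeps the spatial size unchanged)
two_3x3 = nn.Sequential(
    nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
    nn.ReLU(),
    nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
)
# One 5x5 convolution with the same receptive field
one_5x5 = nn.Conv2d(channels, channels, kernel_size=5, padding=2, bias=False)

params_3x3 = count_params(two_3x3)   # 2 * 9 * 64 * 64
params_5x5 = count_params(one_5x5)   # 25 * 64 * 64

x = torch.randn(1, channels, 16, 16)
shape_3x3 = tuple(two_3x3(x).shape)
shape_5x5 = tuple(one_5x5(x).shape)
print(params_3x3, params_5x5, shape_3x3, shape_5x5)
```

The stacked version also inserts an extra nonlinearity between the two convolutions.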

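1.4 GoogLeNet

In a convolutional network, choosing the kernel size of each convolutional layer is a key question. In the Inception network, a single convolutional layer contains several convolution operations of different sizes and is called an Inception module. The Inception network is built by stacking multiple Inception modules with a small number of pooling layers. An Inception module applies kernels of different sizes, such as 1×1, 3×3, and 5×5, in parallel and concatenates (stacks) the resulting feature maps along the depth dimension to form its output feature map.
GoogLeNet consists of 9 Inception v1 modules, 5 pooling layers, and a few other convolutional and fully connected layers, 22 layers in total, as shown in the figures below. To mitigate the vanishing gradient problem, GoogLeNet attaches two auxiliary classifiers to intermediate layers to strengthen the supervision signal.

Structure of the Inception v1 module

GoogLeNet network structure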
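The idea of running several kernel sizes in parallel and concatenating along the depth dimension can be sketched in a few lines of PyTorch. This is a deliberately naive version: the real Inception v1 module also has a max-pooling branch and 1×1 convolutions that reduce the channel count before the larger kernels. The class name `NaiveInception` and all channel counts are assumptions for illustration.

```python
import torch
import torch.nn as nn


class NaiveInception(nn.Module):
    """Parallel 1x1, 3x3 and 5x5 convolutions, concatenated along channels."""

    def __init__(self, in_ch, c1, c3, c5):
        super().__init__()
        # Padding is chosen so every branch keeps the input's height and width
        self.branch1 = nn.Conv2d(in_ch, c1, kernel_size=1)
        self.branch3 = nn.Conv2d(in_ch, c3, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(in_ch, c5, kernel_size=5, padding=2)

    def forward(self, x):
        branches = (self.branch1, self.branch3, self.branch5)
        outs = [torch.relu(branch(x)) for branch in branches]
        return torch.cat(outs, dim=1)  # dim=1 is the channel (depth) dimension


module = NaiveInception(in_ch=32, c1=16, c3=24, c5=8)
inception_out = module(torch.randn(2, 32, 28, 28))
print(inception_out.shape)  # channels: 16 + 24 + 8 = 48
```

Concatenation only works because every branch produces feature maps of the same spatial size, which is why each kernel size is paired with a matching padding.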

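1.5 ResNet

A residual network (ResNet) improves the efficiency of information propagation by adding shortcut (direct) connections around nonlinear convolutional layers.
Suppose that in a deep network we want a nonlinear unit $f(x,θ)$ (one or more convolutional layers) to approximate a target function $h(x)$. The target function can be split into two parts, the identity function $x$ and the residual function $h(x)−x$, so that $h(x) = x + (h(x)−x)$. Because the residual is in practice easier to learn than the original target, the nonlinear unit $f(x,θ)$ is asked to approximate the residual $h(x)−x$, and $f(x,θ)+x$ is used to approximate $h(x)$.

The figure below shows a typical residual unit. It consists of several cascaded (equal-width) convolutional layers and a shortcut connection that skips over them; their sum passes through a ReLU activation to give the output. A residual network is a very deep network built by chaining many such residual units.

A simple residual unit

Comparison of a 34-layer plain convolutional network and a residual network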
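One way to see why the shortcut helps propagation is to differentiate the output of a residual unit with respect to its input. The lines below are a sketch under simplifying assumptions: the symbol $y$ for the unit's output is introduced here, and the final ReLU is ignored.

$$
y = x + f(x,θ), \qquad \frac{\partial y}{\partial x} = I + \frac{\partial f(x,θ)}{\partial x}
$$

The identity term $I$ means that the gradient arriving at $y$ is passed back to $x$ unchanged, in addition to whatever flows through $f$. Even when $\partial f / \partial x$ is small, the signal reaching earlier layers does not vanish.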

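2. Implementing ResNet by Hand

Importing libraries and data. This time we use the CIFAR-10 dataset, a color image dataset that is closer to general real-world objects. It contains RGB color images in 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. Each image is 32 × 32 pixels and each class has 6,000 images; in total the dataset contains 50,000 training images and 10,000 test images.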

import torch 
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
from sklearn.manifold import TSNE
import numpy as np
from matplotlib import cm
import matplotlib.pyplot as plt

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

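# Hyperparameters
num_epochs = 30
batch_size = 100
learning_rate = 0.001

# Training-time augmentation: pad to 40x40, flip horizontally at random,
# then take a random 32x32 crop
transform = transforms.Compose([
        transforms.Pad(4),
        transforms.RandomHorizontalFlip(),
        transforms.RandomCrop(32),
        transforms.ToTensor()])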

train_dataset = torchvision.datasets.CIFAR10(root='Data', train=True, transform = transform, download = True)
test_dataset = torchvision.datasets.CIFAR10(root='Data', train=False, transform = transforms.ToTensor(), download = False)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size = batch_size, shuffle= True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size = batch_size, shuffle= False)

CIFAR-10 examples

Since batch_size is 100, the training set is split into 500 batches and the test set into 100 batches.

print('train size:{}, test size:{}'.format(len(train_loader),len(test_loader)))
#train size:500, test size:100
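The batch counts follow directly from the dataset sizes. A quick check, assuming the `DataLoader` default of keeping the last partial batch (so the count is a ceiling division):

```python
import math

num_train, num_test, batch_size = 50_000, 10_000, 100

# len(DataLoader) equals ceil(dataset_size / batch_size) when drop_last is False
train_batches = math.ceil(num_train / batch_size)
test_batches = math.ceil(num_test / batch_size)
print(train_batches, test_batches)
```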

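A few of the images are visualized below:

from torchvision.transforms import ToPILImage

# Class names in CIFAR-10 label order; needed here for the subplot titles
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

show = ToPILImage()  # converts a Tensor to a PIL Image for easy visualization
fig = plt.figure()
fig.subplots_adjust(left=0, right=1, bottom=0, top=0.8, hspace=0.2, wspace=0.1)
for i in range(6):
    image, label = test_dataset[i]
    ax = fig.add_subplot(2, 3, i + 1, xticks=[], yticks=[])
    ax.set_title(classes[label])
    ax.imshow(show(image))
plt.show()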

Designing the residual network. It has two parts: the residual unit and the residual network itself.

class ResidualBlock(nn.Module):
    def __init__(self, in_channel, out_channel, stride=1, downsample=None):
        super().__init__()
        # Two 3x3 convolutions; only the first may change resolution or channel count
        self.conv1 = nn.Conv2d(in_channel, out_channel, kernel_size=3, stride=stride, padding=1)
        self.bn1 = nn.BatchNorm2d(out_channel)
        self.relu = nn.ReLU()
        self.conv2 = nn.Conv2d(out_channel, out_channel, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(out_channel)
        # Optional projection applied to the shortcut when shapes differ
        self.downsample = downsample

    def forward(self, x):
        residual = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)

        # Compare against None explicitly instead of relying on module truthiness
        if self.downsample is not None:
            residual = self.downsample(x)
        out += residual  # shortcut connection: f(x) + x
        out = self.relu(out)
        return out
    
class ResNet(nn.Module):
    def __init__(self, block, num_classes=10):
        super().__init__()
        self.in_channel = 16  # channels entering the next block; updated by make_layer
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
        self.bn = nn.BatchNorm2d(16)
        self.relu = nn.ReLU()

        # Six residual blocks; each stride of 2 halves the feature map (32 -> 16 -> 8)
        self.block1 = self.make_layer(block, 16, 1)
        self.block2 = self.make_layer(block, 16, 1)
        self.block3 = self.make_layer(block, 32, 2)
        self.block4 = self.make_layer(block, 32, 1)
        self.block5 = self.make_layer(block, 64, 2)
        self.block6 = self.make_layer(block, 64, 1)
        self.avg_pool = nn.AvgPool2d(8)  # 64x8x8 -> 64x1x1
        self.fc = nn.Linear(64, num_classes)

    def make_layer(self, block, out_channel, stride=1):
        # When a block changes the resolution or the channel count, project the
        # shortcut so that it matches the shape of the block's output
        downsample = None
        if (stride != 1) or (self.in_channel != out_channel):
            downsample = nn.Sequential(
                    nn.Conv2d(self.in_channel, out_channel, kernel_size=3, stride=stride, padding=1),
                    nn.BatchNorm2d(out_channel))
        out_layer = block(self.in_channel, out_channel, stride, downsample)
        self.in_channel = out_channel
        return out_layer
    
    def forward(self,x):
        out = self.conv1(x)
        out = self.bn(out)
        out = self.relu(out)
        out = self.block1(out)
        out = self.block2(out)
        out = self.block3(out)
        out = self.block4(out)
        out = self.block5(out)
        out = self.block6(out)
        out = self.avg_pool(out)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out
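The feature-map sizes in the network above can be checked with the standard convolution output formula, floor((n + 2p − k) / s) + 1. The sketch below traces the spatial size through the stem convolution, the six blocks, and the average pooling; the helper name `conv_out` is introduced here for illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling window along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


size = conv_out(32, kernel=3, stride=1, padding=1)   # stem conv keeps 32x32
trace = [size]

# Strides of block1 ... block6, as passed to make_layer
for stride in (1, 1, 2, 1, 2, 1):
    size = conv_out(size, kernel=3, stride=stride, padding=1)  # first conv of the block
    size = conv_out(size, kernel=3, stride=1, padding=1)       # second conv of the block
    trace.append(size)

pooled = conv_out(size, kernel=8, stride=8)   # AvgPool2d(8): stride defaults to the kernel size
flat_features = 64 * pooled * pooled          # 64 channels after block6

print(trace, pooled, flat_features)
```

The final value of 64 is why the fully connected layer is declared as `nn.Linear(64, num_classes)`.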

Training and testing the network.

resnet = ResNet(ResidualBlock).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(resnet.parameters(),lr=learning_rate)

def update_lr(optimizer,lr):
    for para in optimizer.param_groups:
        para['lr'] = lr
        
total_step = len(train_loader)
curr_lr = learning_rate
for epoch in range(num_epochs):
    for idx,(images,labels) in enumerate(train_loader):
        images = images.to(device)
        labels = labels.to(device)
        #print(images.shape)
        outputs = resnet(images)
        loss = criterion(outputs,labels)
        
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        if (idx+1)%100 == 0:
            print ("Epoch [{}/{}], Step [{}/{}] Loss: {:.4f}".format(epoch+1, num_epochs, idx+1, total_step, loss.item()))

    # Decay learning rate
    if (epoch+1) % 20 == 0:
        curr_lr /= 3
        update_lr(optimizer, curr_lr)

resnet.eval()  # evaluation mode: BatchNorm uses its running statistics
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = resnet(images)
        predicted = torch.argmax(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()  # accumulate as a Python int
    # Integer percentage, rounded down
    print('Test Accuracy of the model on the 10000 test images: {} %'.format(100 * correct // total))

As shown below, the accuracy on the test set is 84%, which demonstrates the effectiveness of the network.
Test Accuracy of the model on the 10000 test images: 84 %
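The manual learning-rate decay in the training loop (dividing the rate by 3 every 20 epochs) can also be expressed with PyTorch's built-in `torch.optim.lr_scheduler.StepLR`. The sketch below shows this alternative; it is not the code used for the result above. A tiny `nn.Linear` model and random inputs are assumed as stand-ins so the snippet runs on its own.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # hypothetical stand-in for the ResNet
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# Multiply the learning rate by 1/3 after every 20 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=1 / 3)

lr_history = []
for epoch in range(30):
    lr_history.append(optimizer.param_groups[0]['lr'])  # rate used in this epoch

    # One dummy optimization step standing in for a full pass over the data
    loss = model(torch.randn(8, 4)).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    scheduler.step()  # called once per epoch, after the optimizer steps

print(lr_history[0], lr_history[19], lr_history[20])
```

Epochs 1 to 20 run at 0.001 and epochs 21 to 30 at roughly 0.000333, matching the schedule produced by the hand-written `update_lr` helper.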

Visualizing the classification results.

classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

# Defined before use: the original order called this function before its definition
def plot_with_labels(lowDWeights, labels):
    plt.cla()
    X, Y = lowDWeights[:, 0], lowDWeights[:, 1]
    for x, y, s in zip(X, Y, labels):
        c = cm.rainbow(int(255 * s / 9))  # map class index 0..9 onto the colormap
        plt.text(x, y, classes[s], backgroundcolor=c, fontsize=9)
    plt.xlim(X.min(), X.max())
    plt.ylim(Y.min(), Y.max())
    plt.title('Visualize last layer')
    plt.show()

# Visualization of the network outputs with t-SNE.
# Note: outputs and labels hold only the last test batch (100 samples) from the loop above.
# Newer scikit-learn releases name the n_iter argument max_iter.
tsne = TSNE(perplexity=30, n_components=2, init='pca', n_iter=5000)
plot_only = 500
low_dim_embs = tsne.fit_transform(outputs.detach().cpu().numpy())[:plot_only, :]
plot_labels = labels.cpu().numpy()[:plot_only]
plot_with_labels(low_dim_embs, plot_labels)

Visualization of the classification results


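Under a cotton quilt I wake from an autumn night's dream; before my eyes lie ten thousand li of rivers and mountains. (Xin Qiji, "Qing Ping Yue: Lodging Alone at the Wang Family Hut on Mount Bo")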
