cs231n: assignment2, Q3: Dropout

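In the lecture video, Andrej Karpathy described this assignment as "meaty but educational", and it really is meaty. An assignment usually consists of .ipynb files and .py files. This time each .ipynb file involves several .py files that overlap with one another, so each blog post contains only one .ipynb file or one .py file. (In the earlier assignments each .ipynb file corresponded to a single .py file, so I merged them into one post.)
As always, if you spot any mistakes, please point them out. Any advice is welcome, thanks.
Contents of Dropout.ipynb: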

Dropout

Dropout [1] is a technique for regularizing neural networks by randomly setting some features to zero during the forward pass. In this exercise you will implement a dropout layer and modify your fully-connected network to optionally use dropout.

[1] Geoffrey E. Hinton et al, "Improving neural networks by preventing co-adaptation of feature detectors", arXiv 2012

# As usual, a bit of setup

import time
import numpy as np
import matplotlib.pyplot as plt
from cs231n.classifiers.fc_net import *
from cs231n.data_utils import get_CIFAR10_data
from cs231n.gradient_check import eval_numerical_gradient, eval_numerical_gradient_array
from cs231n.solver import Solver

%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# for auto-reloading external modules
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2

def rel_error(x, y):
  """ returns relative error """
  return np.max(np.abs(x - y) / (np.maximum(1e-8, np.abs(x) + np.abs(y))))

# Load the (preprocessed) CIFAR10 data.

data = get_CIFAR10_data()
for k, v in data.iteritems():
  print '%s: ' % k, v.shape

X_val:  (1000, 3, 32, 32)
X_train:  (49000, 3, 32, 32)
X_test:  (1000, 3, 32, 32)
y_val:  (1000,)
y_train:  (49000,)
y_test:  (1000,)

Dropout forward pass

In the file cs231n/layers.py, implement the forward pass for dropout. Since dropout behaves differently during training and testing, make sure to implement the operation for both modes.

Once you have done so, run the cell below to test your implementation.

x = np.random.randn(500, 500) + 10

for p in [0.3, 0.6, 0.75]:
  out, _ = dropout_forward(x, {'mode': 'train', 'p': p})
  out_test, _ = dropout_forward(x, {'mode': 'test', 'p': p})

  print 'Running tests with p = ', p
  print 'Mean of input: ', x.mean()
  print 'Mean of train-time output: ', out.mean()
  print 'Mean of test-time output: ', out_test.mean()
  print 'Fraction of train-time output set to zero: ', (out == 0).mean()
  print 'Fraction of test-time output set to zero: ', (out_test == 0).mean()
  print

Running tests with p =  0.3
Mean of input:  10.0008904838
Mean of train-time output:  9.9965610718
Mean of test-time output:  10.0008904838
Fraction of train-time output set to zero:  0.70018
Fraction of test-time output set to zero:  0.0

Running tests with p =  0.6
Mean of input:  10.0008904838
Mean of train-time output:  10.0017849856
Mean of test-time output:  10.0008904838
Fraction of train-time output set to zero:  0.39994
Fraction of test-time output set to zero:  0.0

Running tests with p =  0.75
Mean of input:  10.0008904838
Mean of train-time output:  10.007736608
Mean of test-time output:  10.0008904838
Fraction of train-time output set to zero:  0.249488
Fraction of test-time output set to zero:  0.0
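
The fractions printed above (about 0.70 zeros for p = 0.3 and about 0.25 zeros for p = 0.75) suggest that this version of the assignment treats p as the probability of keeping a unit. The train-time mean staying close to the input mean suggests inverted dropout, where the surviving units are rescaled by 1/p during training. Under those two assumptions, a minimal sketch of the forward pass might look like the following. The name dropout_forward_sketch is hypothetical, the code is written for Python 3, and it is not necessarily what cs231n/layers.py contains.

```python
import numpy as np


def dropout_forward_sketch(x, dropout_param):
    """Inverted dropout forward pass (illustrative sketch only).

    Assumption: dropout_param['p'] is the probability of KEEPING a unit.
    """
    p, mode = dropout_param['p'], dropout_param['mode']
    if 'seed' in dropout_param:
        np.random.seed(dropout_param['seed'])

    mask = None
    if mode == 'train':
        # Keep each unit with probability p and rescale by 1/p so that the
        # expected value of the output matches the input.
        mask = (np.random.rand(*x.shape) < p) / p
        out = x * mask
    elif mode == 'test':
        # No rescaling at test time: it already happened during training.
        out = x
    else:
        raise ValueError("mode must be 'train' or 'test'")

    cache = (dropout_param, mask)
    return out.astype(x.dtype, copy=False), cache


x = np.random.randn(500, 500) + 10
out, _ = dropout_forward_sketch(x, {'mode': 'train', 'p': 0.3, 'seed': 0})
out_test, _ = dropout_forward_sketch(x, {'mode': 'test', 'p': 0.3})

print(float((out == 0).mean()))  # fraction zeroed, expected to be near 1 - p
print(float(out.mean()))         # expected to be near x.mean()
```

Rescaling at training time keeps the test-time path a plain identity, which is why the test-time mean above matches the input mean exactly.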

Dropout backward pass

In the file cs231n/layers.py, implement the backward pass for dropout. After doing so, run the following cell to numerically gradient-check your implementation.

x = np.random.randn(10, 10) + 10
dout = np.random.randn(*x.shape)

dropout_param = {'mode': 'train', 'p': 0.8, 'seed': 123}
out, cache = dropout_forward(x, dropout_param)
dx = dropout_backward(dout, cache)
dx_num = eval_numerical_gradient_array(lambda xx: dropout_forward(xx, dropout_param)[0], x, dout)

print 'dx relative error: ', rel_error(dx, dx_num)

dx relative error:  5.44560491149e-11
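
Since the train-time output is the input multiplied elementwise by a fixed mask, the gradient with respect to the input is the upstream gradient multiplied by the same mask. The sketch below illustrates that idea and checks it with a centered-difference gradient. All function names are hypothetical stand-ins (not the cs231n helpers), p is again assumed to be the keep probability, and the code targets Python 3.

```python
import numpy as np


def dropout_forward_sketch(x, p, seed):
    # Inverted dropout: keep each unit with probability p, rescale by 1/p.
    # A fixed seed makes the mask identical on every call, which the
    # numerical check below relies on.
    rng = np.random.RandomState(seed)
    mask = (rng.rand(*x.shape) < p) / p
    return x * mask, mask


def dropout_backward_sketch(dout, mask):
    # out = x * mask elementwise, so d(out)/d(x) is the mask itself.
    return dout * mask


def numerical_gradient_array(f, x, df, h=1e-5):
    # Centered differences, weighted by the upstream gradient df.
    grad = np.zeros_like(x)
    for ix in np.ndindex(*x.shape):
        old = x[ix]
        x[ix] = old + h
        pos = f(x).copy()
        x[ix] = old - h
        neg = f(x).copy()
        x[ix] = old  # restore the original value
        grad[ix] = np.sum((pos - neg) * df) / (2 * h)
    return grad


x = np.random.randn(10, 10) + 10
dout = np.random.randn(*x.shape)

out, mask = dropout_forward_sketch(x, p=0.8, seed=123)
dx = dropout_backward_sketch(dout, mask)
dx_num = numerical_gradient_array(
    lambda xx: dropout_forward_sketch(xx, p=0.8, seed=123)[0], x, dout)

rel = np.max(np.abs(dx - dx_num) / np.maximum(1e-8, np.abs(dx) + np.abs(dx_num)))
print(rel)
```

In test mode the layer is the identity, so the backward pass would simply return dout unchanged.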

Fully-connected nets with Dropout

In the file cs231n/classifiers/fc_net.py, modify your implementation to use dropout. Specifically, if the constructor of the net receives a nonzero value for the dropout parameter, then the net should add dropout immediately after every ReLU nonlinearity. After doing so, run the following to numerically gradient-check your implementation.

N, D, H1, H2, C = 2, 15, 20, 30, 10
X = np.random.randn(N, D)
y = np.random.randint(C, size=(N,))

for dropout in [0, 0.25, 0.5]:
  print 'Running check with dropout = ', dropout
  model = FullyConnectedNet([H1, H2], input_dim=D, num_classes=C,
                            weight_scale=5e-2, dtype=np.float64,
                            dropout=dropout, seed=123)

  loss, grads = model.loss(X, y)
  print 'Initial loss: ', loss

  for name in sorted(grads):
    f = lambda _: model.loss(X, y)[0]
    grad_num = eval_numerical_gradient(f, model.params[name], verbose=False, h=1e-5)
    print '%s relative error: %.2e' % (name, rel_error(grad_num, grads[name]))
  print

Running check with dropout =  0
Initial loss:  2.3051948274
W1 relative error: 2.53e-07
W2 relative error: 1.50e-05
W3 relative error: 2.75e-07
b1 relative error: 2.94e-06
b2 relative error: 5.05e-08
b3 relative error: 1.17e-10

Running check with dropout =  0.25
Initial loss:  2.31264683457
W1 relative error: 1.48e-08
W2 relative error: 2.34e-10
W3 relative error: 3.56e-08
b1 relative error: 1.53e-09
b2 relative error: 1.84e-10
b3 relative error: 8.70e-11

Running check with dropout =  0.5
Initial loss:  2.30243758771
W1 relative error: 4.55e-08
W2 relative error: 2.97e-08
W3 relative error: 4.34e-07
b1 relative error: 1.87e-08
b2 relative error: 5.05e-09
b3 relative error: 7.49e-11
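
To make the layer ordering concrete, here is a rough sketch of a forward pass that applies dropout right after each ReLU and leaves the final affine layer untouched. forward_with_dropout is a hypothetical helper written for this post, not the FullyConnectedNet code; it assumes p is the keep probability and that p == 0 switches dropout off. It is written for Python 3.

```python
import numpy as np


def forward_with_dropout(X, weights, biases, p, mode, rng):
    """Scores of a fully-connected net with dropout after every ReLU.

    Hypothetical helper for illustration only.
    """
    h = X
    masks = []
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(0, h.dot(W) + b)          # affine, then ReLU
        if p > 0 and mode == 'train':
            mask = (rng.rand(*h.shape) < p) / p  # inverted dropout
            h = h * mask
            masks.append(mask)                   # a backward pass would need these
    scores = h.dot(weights[-1]) + biases[-1]     # last layer: affine only
    return scores, masks


rng = np.random.RandomState(123)
N, D, H1, H2, C = 2, 15, 20, 30, 10
dims = [D, H1, H2, C]
weights = [5e-2 * rng.randn(a, b) for a, b in zip(dims[:-1], dims[1:])]
biases = [np.zeros(b) for b in dims[1:]]
X = rng.randn(N, D)

train_scores, masks = forward_with_dropout(X, weights, biases, 0.5, 'train', rng)
test_scores, _ = forward_with_dropout(X, weights, biases, 0.5, 'test', rng)
plain_scores, _ = forward_with_dropout(X, weights, biases, 0, 'train', rng)

print(train_scores.shape)
print(len(masks))
```

In this sketch the test-mode scores equal the scores of the same network with dropout switched off, because inverted dropout moves all rescaling into training.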

Regularization experiment

As an experiment, we will train a pair of two-layer networks on 500 training examples: one will use no dropout, and one will use a dropout probability of 0.75. We will then visualize the training and validation accuracies of the two networks over time.

# Train two identical nets, one with dropout and one without

num_train = 500
small_data = {
  'X_train': data['X_train'][:num_train],
  'y_train': data['y_train'][:num_train],
  'X_val': data['X_val'],
  'y_val': data['y_val'],
}

solvers = {}
dropout_choices = [0, 0.75]
for dropout in dropout_choices:
  model = FullyConnectedNet([500], dropout=dropout)
  print dropout

  solver = Solver(model, small_data,
                  num_epochs=25, batch_size=100,
                  update_rule='adam',
                  optim_config={
                    'learning_rate': 5e-4,
                  },
                  verbose=True, print_every=100)
  solver.train()
  solvers[dropout] = solver

0
(Iteration 1 / 125) loss: 8.596245
(Epoch 0 / 25) train acc: 0.224000; val_acc: 0.183000
(Epoch 1 / 25) train acc: 0.382000; val_acc: 0.219000
(Epoch 2 / 25) train acc: 0.484000; val_acc: 0.248000
(Epoch 3 / 25) train acc: 0.620000; val_acc: 0.274000
(Epoch 4 / 25) train acc: 0.654000; val_acc: 0.246000
(Epoch 5 / 25) train acc: 0.726000; val_acc: 0.280000
(Epoch 6 / 25) train acc: 0.788000; val_acc: 0.304000
(Epoch 7 / 25) train acc: 0.818000; val_acc: 0.264000
(Epoch 8 / 25) train acc: 0.846000; val_acc: 0.270000
(Epoch 9 / 25) train acc: 0.896000; val_acc: 0.288000
(Epoch 10 / 25) train acc: 0.926000; val_acc: 0.297000
(Epoch 11 / 25) train acc: 0.964000; val_acc: 0.276000
(Epoch 12 / 25) train acc: 0.950000; val_acc: 0.275000
(Epoch 13 / 25) train acc: 0.964000; val_acc: 0.299000
(Epoch 14 / 25) train acc: 0.952000; val_acc: 0.275000
(Epoch 15 / 25) train acc: 0.974000; val_acc: 0.291000
(Epoch 16 / 25) train acc: 0.984000; val_acc: 0.290000
(Epoch 17 / 25) train acc: 0.968000; val_acc: 0.286000
(Epoch 18 / 25) train acc: 0.974000; val_acc: 0.297000
(Epoch 19 / 25) train acc: 0.972000; val_acc: 0.275000
(Epoch 20 / 25) train acc: 0.994000; val_acc: 0.296000
(Iteration 101 / 125) loss: 0.021468
(Epoch 21 / 25) train acc: 0.998000; val_acc: 0.298000
(Epoch 22 / 25) train acc: 0.994000; val_acc: 0.306000
(Epoch 23 / 25) train acc: 0.992000; val_acc: 0.303000
(Epoch 24 / 25) train acc: 0.994000; val_acc: 0.310000
(Epoch 25 / 25) train acc: 0.998000; val_acc: 0.303000
0.75
(Iteration 1 / 125) loss: 10.053350
(Epoch 0 / 25) train acc: 0.274000; val_acc: 0.230000
(Epoch 1 / 25) train acc: 0.352000; val_acc: 0.211000
(Epoch 2 / 25) train acc: 0.444000; val_acc: 0.269000
(Epoch 3 / 25) train acc: 0.566000; val_acc: 0.263000
(Epoch 4 / 25) train acc: 0.650000; val_acc: 0.257000
(Epoch 5 / 25) train acc: 0.680000; val_acc: 0.280000
(Epoch 6 / 25) train acc: 0.768000; val_acc: 0.310000
(Epoch 7 / 25) train acc: 0.774000; val_acc: 0.270000
(Epoch 8 / 25) train acc: 0.824000; val_acc: 0.274000
(Epoch 9 / 25) train acc: 0.894000; val_acc: 0.288000
(Epoch 10 / 25) train acc: 0.876000; val_acc: 0.282000
(Epoch 11 / 25) train acc: 0.924000; val_acc: 0.312000
(Epoch 12 / 25) train acc: 0.936000; val_acc: 0.309000
(Epoch 13 / 25) train acc: 0.894000; val_acc: 0.285000
(Epoch 14 / 25) train acc: 0.938000; val_acc: 0.281000
(Epoch 15 / 25) train acc: 0.948000; val_acc: 0.323000
(Epoch 16 / 25) train acc: 0.930000; val_acc: 0.303000
(Epoch 17 / 25) train acc: 0.944000; val_acc: 0.286000
(Epoch 18 / 25) train acc: 0.930000; val_acc: 0.316000
(Epoch 19 / 25) train acc: 0.984000; val_acc: 0.311000
(Epoch 20 / 25) train acc: 0.964000; val_acc: 0.310000
(Iteration 101 / 125) loss: 0.598348
(Epoch 21 / 25) train acc: 0.972000; val_acc: 0.332000
(Epoch 22 / 25) train acc: 0.990000; val_acc: 0.310000
(Epoch 23 / 25) train acc: 0.988000; val_acc: 0.300000
(Epoch 24 / 25) train acc: 0.978000; val_acc: 0.296000
(Epoch 25 / 25) train acc: 0.994000; val_acc: 0.310000

# Plot train and validation accuracies of the two models

train_accs = []
val_accs = []
for dropout in dropout_choices:
  solver = solvers[dropout]
  train_accs.append(solver.train_acc_history[-1])
  val_accs.append(solver.val_acc_history[-1])

plt.subplot(3, 1, 1)
for dropout in dropout_choices:
  plt.plot(solvers[dropout].train_acc_history, 'o', label='%.2f dropout' % dropout)
plt.title('Train accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(ncol=2, loc='lower right')
  
plt.subplot(3, 1, 2)
for dropout in dropout_choices:
  plt.plot(solvers[dropout].val_acc_history, 'o', label='%.2f dropout' % dropout)
plt.title('Val accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(ncol=2, loc='lower right')

plt.gcf().set_size_inches(15, 15)
plt.show()
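[Figure: "Train accuracy" (top) and "Val accuracy" (bottom) plotted against epoch for the 0.00 and 0.75 dropout networks]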

Question

Explain what you see in this experiment. What does it suggest about dropout?

Answer

Both networks reach nearly 100% training accuracy, but the validation accuracy of the network trained with dropout 0.75 is slightly higher than that of the network without dropout (0.310 versus 0.303 at the last epoch).
This suggests that dropout acts as a regularizer and helps prevent overfitting, although the difference in this run is small.
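
One simple way to put a number on this comparison is the gap between training and validation accuracy at the last epoch. The snippet below only redoes that arithmetic with the epoch 25 values copied from the training logs above; the dictionary layout is just an illustration, and the code is written for Python 3.

```python
# Final-epoch (epoch 25) accuracies copied from the training logs above.
final = {
    0.0:  {'train': 0.998, 'val': 0.303},
    0.75: {'train': 0.994, 'val': 0.310},
}

gaps = {}
for dropout, acc in sorted(final.items()):
    # A large train/val gap is the usual symptom of overfitting.
    gaps[dropout] = acc['train'] - acc['val']
    print('dropout %.2f: train - val gap = %.3f' % (dropout, gaps[dropout]))
```

The gap is only slightly smaller with dropout, and both networks still overfit the 500 training examples heavily, so a single run like this is weak evidence on its own.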
