UFLDL New Tutorial and Programming Exercises (11): Self-Taught Learning

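UFLDL is an early introduction to deep learning written by Andrew Ng's team. Its rhythm of theory followed by exercises works very well: every time, I want to finish the theory quickly and get my hands on the exercise. The starter code already lays out the whole framework with detailed comments, so we only need to write a small amount of core code, which makes it quick to get started!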

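I could not find a Chinese translation of this part of the new tutorial -_-. This is the last exercise; after it I can move on to cs231n, so I need to speed up!
Section 11 is: Self-Taught Learning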

feedfowardRICA.m

function features = feedfowardRICA(filterDim, poolDim, numFilters, images, W)
% feedfowardRICA Returns the convolution of the features given by W with
% the given images. It should be very similar to cnnConvolve.m+cnnPool.m 
% in the CNN exercise, except that there is no bias term b, and the pooling
% is RICA-style square-square-root pooling instead of average pooling.
%
% Parameters:
%  filterDim - filter (feature) dimension
%  poolDim - dimension of the (square) pooling region
%  numFilters - number of feature maps
%  images - large images to convolve with, matrix in the form
%           images(r, c, image number)
%  W    - W should be the weights learnt using RICA
%         W is of shape (filterDim,filterDim,numFilters)
%
% Returns:
%  features - matrix of convolved and pooled features in the form
%                      features(imageRow, imageCol, featureNum, imageNum)
global params;
numImages = size(images, 3);
imageDim = size(images, 1);
convDim = imageDim - filterDim + 1; % 20

features = zeros(convDim / poolDim, ...
        convDim / poolDim, numFilters, numImages); % 4 * 4 * 32 * numImages
poolMat = ones(poolDim);
% Instructions:
%   Convolve every filter with every image just like what you did in
%   cnnConvolve.m to get a response.
%   Then perform square-square-root pooling on the response with 3 steps:
%      1. Square every element in the response
%      2. Sum everything in each pooling region
%      3. add params.epsilon to every element before taking element-wise square-root
%      (Hint: use poolMat similarly as in cnnPool.m)



for imageNum = 1:numImages
  if mod(imageNum,500)==0
    fprintf('forward-prop image %d\n', imageNum);
  end
  for filterNum = 1:numFilters

    % filter = zeros(8,8); % You should replace this
    % Form W, obtain the feature (filterDim x filterDim) needed during the
    % convolution
    %%% YOUR CODE HERE %%%
    filter = squeeze(W(:,:,filterNum));

    % Flip the feature matrix because of the definition of convolution, as explained later
    filter = rot90(squeeze(filter),2);
      
    % Obtain the image
    im = squeeze(images(:, :, imageNum));

    % resp = zeros(convDim, convDim); % You should replace this
    % Convolve "filter" with "im" to find "resp"
    % be sure to do a 'valid' convolution
    %%% YOUR CODE HERE %%%
    resp = conv2(im,filter,'valid');
    
    % Then, apply square-square-root pooling on "resp" to get the hidden
    % activation "act"
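    act = zeros(convDim / poolDim, convDim / poolDim); % 20/5 = 4 per side
    %%% YOUR CODE HERE %%%
    resp1 = resp .^2;                      % 1. square every element
    % 2. sum over every poolDim x poolDim window; computed once, outside the loops
    temp = conv2(resp1,poolMat,'valid');
    for i = 1: convDim / poolDim
        for j = 1: convDim / poolDim
            % keep only the non-overlapping windows, then
            % 3. add epsilon and take the square root
            act(i,j) = sqrt(temp(poolDim*(i-1)+1,poolDim*(j-1)+1) + params.epsilon);
        end
    end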
    
    
    features(:, :, filterNum, imageNum) = act;
  end
end

end
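
The three pooling steps can also be checked on a toy response without the double loop. This is only a minimal sketch: the 4x4 response, `poolDim = 2`, and the local variable `epsilon` are made-up illustration values, not the ones used in the exercise. It subsamples the `conv2` output with stride `poolDim` and compares one entry against a brute-force sum.

```matlab
% Toy check of square-square-root pooling (illustrative values only).
resp    = reshape(1:16, 4, 4);   % hypothetical 4x4 convolution response
poolDim = 2;                     % hypothetical pooling size
epsilon = 1e-2;                  % plays the role of params.epsilon

poolMat = ones(poolDim);
% Sum of squares over every sliding poolDim x poolDim window.
summed  = conv2(resp .^ 2, poolMat, 'valid');
% Keep only the non-overlapping windows by subsampling with stride poolDim.
act = sqrt(summed(1:poolDim:end, 1:poolDim:end) + epsilon);

% Brute-force value for the top-left pooling region, for comparison.
block    = resp(1:poolDim, 1:poolDim);
expected = sqrt(sum(block(:) .^ 2) + epsilon);
fprintf('act(1,1) = %f, brute force = %f\n', act(1,1), expected);
```

The strided indexing picks the same entries as `temp(poolDim*(i-1)+1, poolDim*(j-1)+1)` in the function above, so it could replace the two inner loops if you prefer.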

Script stlExercise.m

%% CS294A/CS294W Self-taught Learning Exercise

%  Instructions
%  ------------
% 
%  This file contains code that helps you get started on the
%  self-taught learning. You will need to complete code in feedForwardAutoencoder.m
%  You will also need to have implemented sparseAutoencoderCost.m and 
%  softmaxCost.m from previous exercises. (None of these files seem to exist in the new version; this text is probably left over from the old tutorial.)
%
%% ======================================================================
%  STEP 0: Here we provide the relevant parameters values that will
%  allow your RICA to get good filters; you do not need to 
%  change the parameters below.
clear;close all;
addpath(genpath('E:\SummerCourse\UFLDL\stanford_dl_ex-master\common')) % path to minfunc
imgSize = 28;
global params;
params.patchWidth=9;           % width of a patch
params.n=params.patchWidth^2;   % dimensionality of input to RICA
params.lambda = 0.0005;   % sparsity cost
params.numFeatures = 32; % number of filter banks to learn
params.epsilon = 1e-2;   

%% ======================================================================
%  STEP 1: Load data from the MNIST database
%
%  This loads our training and test data from the MNIST database files.
%  We have sorted the data for you in this so that you will not have to
%  change it.

% Load MNIST database files
mnistData   = loadMNISTImages('E:\SummerCourse\UFLDL\common\train-images-idx3-ubyte'); % 784*60000
mnistLabels = loadMNISTLabels('E:\SummerCourse\UFLDL\common\train-labels-idx1-ubyte'); % 60000*1

numExamples = size(mnistData, 2);
% 50000 of the data are pretended to be unlabelled
unlabeledSet = 1:50000;
unlabeledData = mnistData(:, unlabeledSet);

% the rest are equally splitted into labelled train and test data


trainSet = 50001:55000;
testSet = 55001:60000;
trainData   = mnistData(:, trainSet);
trainLabels = mnistLabels(trainSet)' + 1; % Shift Labels to the Range 1-10
% only keep digits 0-4, so that unlabelled dataset has different distribution
% than the labelled one.
removeSet = find(trainLabels > 5);
trainData(:,removeSet)= [] ;
trainLabels(removeSet) = [];

testData   = mnistData(:, testSet);
testLabels = mnistLabels(testSet)' + 1;   % Shift Labels to the Range 1-10
% only keep digits 0-4
removeSet = find(testLabels > 5);
testData(:,removeSet)= [] ;
testLabels(removeSet) = [];


% Output Some Statistics
fprintf('# examples in unlabeled set: %d\n\n', size(unlabeledData, 2));
fprintf('# examples in supervised training set trainData: %d\n\n', size(trainData, 2));
fprintf('# examples in supervised testing set testData: %d\n\n', size(testData, 2));

%% ======================================================================
%  STEP 2: Train the RICA
%  This trains the RICA on the unlabeled training images. 

%  Randomly initialize the parameters
randTheta = randn(params.numFeatures,params.n)*0.01;  % 1/sqrt(params.n); 32*81
randTheta = randTheta ./ repmat(sqrt(sum(randTheta.^2,2)), 1, size(randTheta,2)); 
randTheta = randTheta(:); % 2592 = 32*81

% subsample random patches from the unlabelled+training data; however, the new tutorial says to use only the unlabelled data
% patches = samplePatches([unlabeledData,trainData],params.patchWidth,200000); % 81*200000
patches = samplePatches(unlabeledData,params.patchWidth,200000); % 81*200000

%configure minFunc
options.Method = 'lbfgs';
options.MaxFunEvals = Inf;
options.MaxIter = 1000;
% You'll need to replace this line with RICA training code
% opttheta = randTheta;

%  Find opttheta by running the RICA on all the training patches.
%  You will need to whiten the patches with the zca2 function 
%  then call minFunc with the softICACost function as seen in the RICA exercise.
%%% YOUR CODE HERE %%%
patches = zca2(patches); 
m = sqrt(sum(patches.^2) + (1e-8));
x = bsxfunwrap(@rdivide,patches,m);
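% Earlier, lambda inside softICACost.m was written as 1, which was too large, so the optimization stopped after only a few iterations.
% After making it smaller it worked: accuracy went up and reached the tutorial's standard, and the plot of W' also shows roughly what the filters look like.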
tic;
[opttheta, cost, exitflag] = minFunc( @(theta) softICACost(theta, x, params), randTheta, options); 
fprintf('# Optimization took: %f seconds.\n', toc);
% reshape visualize weights
W = reshape(opttheta, params.numFeatures, params.n); % 32*81
display_network(W');

%% ======================================================================

%% STEP 3: Extract Features from the Supervised Dataset
% pre-multiply the weights with whitening matrix, equivalent to whitening
% each image patch before applying convolution. V should be the same V
% returned by the zca2 when you whiten the patches.
% W = W*V; % I was not sure what V is (per the comment above, the whitening matrix from zca2), so this is commented out for now
%  reshape RICA weights to be convolutional weights.
W = reshape(W, params.numFeatures, params.patchWidth, params.patchWidth);
W = permute(W, [2,3,1]); % patchWidth * patchWidth * numFeatures

%  setting up convolutional feed-forward. You do not need to modify this code.
filterDim = params.patchWidth;
poolDim = 5;
numFilters = params.numFeatures;
trainImages=reshape(trainData, imgSize, imgSize, size(trainData, 2));
testImages=reshape(testData, imgSize, imgSize, size(testData, 2));
%  Compute convolutional responses
%  TODO: You will need to complete feedfowardRICA.m; it returns the hidden-layer features after convolution and pooling
trainAct = feedfowardRICA(filterDim, poolDim, numFilters, trainImages, W);
fprintf('# The counter going from 2500 back to 500 is not an error; it is the next feedfowardRICA call\n');
testAct = feedfowardRICA(filterDim, poolDim, numFilters, testImages, W);
%  reshape the responses into feature vectors
featureSize = size(trainAct,1)*size(trainAct,2)*size(trainAct,3); % 512
trainFeatures = reshape(trainAct, featureSize, size(trainData, 2)); % 512*2538
testFeatures = reshape(testAct, featureSize, size(testData, 2)); %512*2520
%% ======================================================================
%% STEP 4: Train the softmax classifier

numClasses  = 5; % doing 5-class digit recognition
% initialize softmax weights randomly
randTheta2 = randn(numClasses, featureSize)*0.01;  % 1/sqrt(params.n);
randTheta2 = randTheta2 ./ repmat(sqrt(sum(randTheta2.^2,2)), 1, size(randTheta2,2)); 
randTheta2 = randTheta2';
randTheta2 = randTheta2(:);

%  Use minFunc and softmax_regression_vec from the previous exercise to 
%  train a multi-class classifier. 
options.Method = 'lbfgs';
options.MaxFunEvals = Inf;
options.MaxIter = 300;

% optimize
%%% YOUR CODE HERE %%%
[opt_theta, ~, ~] = minFunc(@softmax_regression_vec, randTheta2, options, trainFeatures, trainLabels);
opt_theta = reshape(opt_theta,featureSize,numClasses);
opt_theta = opt_theta'; % numClasses * featureSize
%%======================================================================
%% STEP 5: Testing 
% Compute Predictions on train and test sets using softmaxPredict
% and softmaxModel (neither exists in the new version; this is from the old tutorial)
%%% YOUR CODE HERE %%%
[~,train_pred] = max(opt_theta * trainFeatures); % opt_theta from minFunc is 2560*1 (same size as randTheta2), reshaped above to 5*512; trainFeatures: 512*2538
[~,pred] = max(opt_theta * testFeatures); % testFeatures:512*2520

% Classification Score
fprintf('Train Accuracy: %f%%\n', 100*mean(train_pred(:) == trainLabels(:))); % trainLabels:1*2538
fprintf('Test Accuracy: %f%%\n', 100*mean(pred(:) == testLabels(:))); % testLabels:1*2520
% You should get 100% train accuracy and ~99% test accuracy. With random
% convolutional weights we get 97.5% test accuracy. Actual results may
% vary as a result of random initializations
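
The size comments scattered through the script (20, 512, and so on) all follow from a few lines of arithmetic. A minimal sketch that recomputes them, using the values set in the script above (28x28 images, 9x9 patches, pooling size 5, 32 filters, 5 classes); the local variable names are only for illustration:

```matlab
% Recompute the feature dimensions used in stlExercise.m.
imgSize     = 28;   % MNIST image side
patchWidth  = 9;    % params.patchWidth, also the filter size
poolDim     = 5;    % pooling region side
numFeatures = 32;   % params.numFeatures
numClasses  = 5;    % digits 0-4

convDim = imgSize - patchWidth + 1;          % side of a 'valid' convolution: 20
assert(mod(convDim, poolDim) == 0, 'poolDim must divide convDim');
pooledDim   = convDim / poolDim;             % pooled side: 4
featureSize = pooledDim^2 * numFeatures;     % features per image: 512
numSoftmaxParams = numClasses * featureSize; % length of randTheta2: 2560

fprintf('convDim = %d, pooledDim = %d, featureSize = %d, softmax params = %d\n', ...
        convDim, pooledDim, featureSize, numSoftmaxParams);
```

If you change `params.patchWidth` or `poolDim`, the assertion is a quick way to see whether the pooling regions still tile the convolved image exactly.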

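Results:
Earlier, the lambda = 1 parameter in softICACost.m was badly tuned; after changing it to 0.1, the accuracy matched the one given in the tutorial:

(Figure: self-taught learning result)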

The tutorial's own words:

If you’ve done all the steps correctly, you should get 100% train accuracy and ~99% test accuracy.
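
One step the script leaves commented out is `W = W*V` in STEP 3. The starter comment there says that pre-multiplying the weights by the whitening matrix is equivalent to whitening each patch before applying the filters. Below is a toy sketch of why that holds. Everything in it is an assumption for illustration: the data are random, `V` is built with a textbook ZCA formula rather than taken from `zca2` (whose implementation is not shown in this post), and the sizes and `epsilon` are made up.

```matlab
% Toy demonstration: folding a whitening matrix into the filters.
n = 4;  m = 50;                      % hypothetical patch dimension and patch count
x = randn(n, m);                     % toy "patches", one per column
x = bsxfun(@minus, x, mean(x, 2));   % remove the mean of each dimension

% Textbook ZCA whitening matrix (assumed form, not necessarily what zca2 does).
epsilon = 1e-4;                      % hypothetical regularizer
sigma   = x * x' / m;                % covariance estimate
[U, S, ~] = svd(sigma);
V = U * diag(1 ./ sqrt(diag(S) + epsilon)) * U';

W = randn(3, n);                     % hypothetical filters, one per row

a = W * (V * x);                     % whiten the patches first, then filter
b = (W * V) * x;                     % fold V into the filters, filter raw patches
maxDiff = max(abs(a(:) - b(:)));
fprintf('largest difference between the two orders: %g\n', maxDiff);
```

The two results agree up to floating-point error because matrix multiplication is associative. In the script, this would mean asking `zca2` for `V` as well (assuming your implementation returns it) and applying `W = W*V` before the reshape to convolutional weights.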

If I have misunderstood anything, please point it out; if you have better ideas, feel free to discuss them in the comments below!
