Deep Belief Network

References:

http://deeplearning.net/tutorial/DBN.html
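
The parts below are excerpted from the tutorial's DBN.py. They assume the imports used in that file; the helper modules logistic_sgd.py, mlp.py, and rbm.py ship with the tutorial code, so adjust the module paths if your local copy is organized differently:

from __future__ import print_function

import timeit
import numpy
import theano
import theano.tensor as T
from theano.sandbox.rng_mrg import MRG_RandomStreams

# helper classes from the tutorial's companion files
from logistic_sgd import LogisticRegression, load_data
from mlp import HiddenLayer
from rbm import RBM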

Part 1

class DBN(object):

    """Deep Belief Network

    A deep belief network is obtained by stacking several RBMs on top of
    each other. The hidden layer of the RBM at layer `i` becomes the input
    of the RBM at layer `i+1`. The first-layer RBM takes the input of the
    network as its input, while the hidden layer of the last RBM represents
    the output. When used for classification, the DBN is treated as an MLP
    by adding a logistic regression layer on top.

    """

    def __init__(self, numpy_rng, theano_rng=None, n_ins=784,

                hidden_layers_sizes=[500, 500], n_outs=10):

        """这个类支持可变数量的层。

        :type numpy_rng: numpy.random.RandomState

        :param numpy_rng: numpy random number generator used to draw initial weights

        :type theano_rng: theano.tensor.shared_randomstreams.RandomStreams

        :param theano_rng: Theano random generator; if None is given, one is generated based on a seed drawn from `rng`

        :type n_ins: int

        :param n_ins: dimension of the input to the DBN

        :type hidden_layers_sizes: list of ints

        :param hidden_layers_sizes: sizes of the intermediate layers; must contain at least one value

        :type n_outs: int

        :param n_outs: dimension of the output of the DBN

        """

        self.sigmoid_layers = []

        self.rbm_layers = []

        self.params = []

        self.n_layers = len(hidden_layers_sizes)

        assert self.n_layers > 0

        if not theano_rng:

            theano_rng = MRG_RandomStreams(numpy_rng.randint(2 ** 30))

        # allocate symbolic variables for the data

        # the data is presented as rasterized images

        self.x = T.matrix('x')

        # the labels are presented as a 1D vector of [int] labels

        self.y = T.ivector('y')
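
To make the stacking described in the docstring concrete: with the default arguments above (n_ins=784, hidden_layers_sizes=[500, 500], n_outs=10), the layer sizes chain as 784 -> 500 -> 500 -> 10. A tiny illustrative snippet (default values only, not a fixed requirement):

# Illustrative only: consecutive pairs give the (n_in, n_out) shape of each
# layer's weight matrix; the hidden-layer pairs get RBMs, while the last
# pair is the logistic regression output layer added on top.
sizes = [784] + [500, 500] + [10]
print(list(zip(sizes[:-1], sizes[1:])))   # [(784, 500), (500, 500), (500, 10)]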



Part 2

        for i in range(self.n_layers):

            # construct the sigmoidal layer

            # the size of the input is either the number of hidden
            # units of the layer below, or the input size if we are
            # on the first layer

            if i == 0:

                input_size = n_ins

            else:

                input_size = hidden_layers_sizes[i - 1]

            # the input to this layer is either the activation of the

            # hidden layer below or the input of the DBN if you are on

            # the first layer

            if i == 0:

                layer_input = self.x

            else:

                layer_input = self.sigmoid_layers[-1].output

            sigmoid_layer = HiddenLayer(rng=numpy_rng,

                                        input=layer_input,

                                        n_in=input_size,

                                        n_out=hidden_layers_sizes[i],

                                        activation=T.nnet.sigmoid)

            # add the layer to our list of layers

            self.sigmoid_layers.append(sigmoid_layer)

            # note that the parameters of the sigmoid layers are parameters
            # of the DBN; the visible biases of the RBMs are parameters of
            # those RBMs only, not of the DBN

            self.params.extend(sigmoid_layer.params)

            # construct an RBM that shares weights with this layer

            rbm_layer = RBM(numpy_rng=numpy_rng,

                            theano_rng=theano_rng,

                            input=layer_input,

                            n_visible=input_size,

                            n_hidden=hidden_layers_sizes[i],

                            W=sigmoid_layer.W,

                            hbias=sigmoid_layer.b)

            self.rbm_layers.append(rbm_layer)
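
Because W and hbias are passed into each RBM from the corresponding sigmoid layer, pre-training updates the very same shared variables that the MLP uses during fine-tuning. A quick hypothetical check (run after constructing a dbn as in Part 7):

# Each RBM reuses the sigmoid layer's weight matrix and hidden bias,
# so these are the same shared variables, not copies.
for sig, rbm in zip(dbn.sigmoid_layers, dbn.rbm_layers):
    assert rbm.W is sig.W
    assert rbm.hbias is sig.b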



Part 3

        self.logLayer = LogisticRegression(

            input=self.sigmoid_layers[-1].output,

            n_in=hidden_layers_sizes[-1],

            n_out=n_outs)

        self.params.extend(self.logLayer.params)

        # compute the cost for the second phase of training, defined as the
        # negative log likelihood of the logistic regression (output) layer

        self.finetune_cost = self.logLayer.negative_log_likelihood(self.y)

        # symbolic variable that points to the number of errors made on the
        # minibatch given by self.x and self.y

        self.errors = self.logLayer.errors(self.y)
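
For reference, negative_log_likelihood in the tutorial's logistic_sgd.py returns the mean negative log-probability of the correct label over the minibatch. A minimal standalone sketch of that expression, where p_y_given_x is the softmax output of the top layer and y the integer label vector:

import theano.tensor as T

def negative_log_likelihood_sketch(p_y_given_x, y):
    # mean over the minibatch of -log P(Y = y_i | x_i)
    return -T.mean(T.log(p_y_given_x)[T.arange(y.shape[0]), y])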



Part 4

    def pretraining_functions(self, train_set_x, batch_size, k):

        '''Generates a list of functions, for performing one step of
        gradient descent at a given layer. The function will require as
        input the minibatch index, and to train an RBM you just need to
        iterate, calling the corresponding function on all minibatch
        indexes.

        :type train_set_x: theano.tensor.TensorType

        :param train_set_x: Shared var. that contains all datapoints used for training the RBM

        :type batch_size: int

        :param batch_size: size of a [mini]batch

        :param k: number of Gibbs steps to do in CD-k / PCD-k

        '''

        # index to a [mini]batch

        index = T.lscalar('index')  # index to a minibatch



Part 5

        learning_rate = T.scalar('lr')  # learning rate to use

        # beginning of a batch, given `index`

        batch_begin = index * batch_size

        # ending of a batch given `index`

        batch_end = batch_begin + batch_size

        pretrain_fns = []

        for rbm in self.rbm_layers:

            # get the cost and the updates list

            # using CD-k here (persistent=None) for training each RBM.

            # TODO: change cost function to reconstruction error

            cost, updates = rbm.get_cost_updates(learning_rate,

                                                persistent=None, k=k)

            # compile the theano function

            fn = theano.function(

                inputs=[index, theano.In(learning_rate, value=0.1)],

                outputs=cost,

                updates=updates,

                givens={

                    self.x: train_set_x[batch_begin:batch_end]

                }

            )

            # append `fn` to the list of functions

            pretrain_fns.append(fn)

        return pretrain_fns
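
Note that theano.In(learning_rate, value=0.1) registers the learning rate as an optional input with default value 0.1, which is why Part 8 can call the compiled functions as pretraining_fns[i](index=batch_index, lr=pretrain_lr). A tiny standalone illustration of this mechanism (toy expression, unrelated to the DBN itself):

import theano
import theano.tensor as T

i = T.lscalar('i')
lr = T.scalar('lr')
f = theano.function([i, theano.In(lr, value=0.1)], i * lr)
print(f(3))          # uses the default lr = 0.1
print(f(3, lr=0.5))  # overrides the default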



Part 6 

    def build_finetune_functions(self, datasets, batch_size, learning_rate):

        '''Generates a function `train` that implements one step of

        finetuning, a function `validate` that computes the error on a

        batch from the validation set, and a function `test` that

        computes the error on a batch from the testing set

        :type datasets: list of pairs of theano.tensor.TensorType

        :param datasets: It is a list that contains all the datasets;
                        it has to contain three pairs, `train`,

                        `valid`, `test` in this order, where each pair

                        is formed of two Theano variables, one for the

                        datapoints, the other for the labels

        :type batch_size: int

        :param batch_size: size of a minibatch

        :type learning_rate: float

        :param learning_rate: learning rate used during finetune stage

        '''

        (train_set_x, train_set_y) = datasets[0]

        (valid_set_x, valid_set_y) = datasets[1]

        (test_set_x, test_set_y) = datasets[2]

        # compute number of minibatches for training, validation and testing

        n_valid_batches = valid_set_x.get_value(borrow=True).shape[0]

        n_valid_batches //= batch_size

        n_test_batches = test_set_x.get_value(borrow=True).shape[0]

        n_test_batches //= batch_size

        index = T.lscalar('index')  # index to a [mini]batch

        # compute the gradients with respect to the model parameters

        gparams = T.grad(self.finetune_cost, self.params)

        # compute list of fine-tuning updates

        updates = []

        for param, gparam in zip(self.params, gparams):

            updates.append((param, param - gparam * learning_rate))

        train_fn = theano.function(

            inputs=[index],

            outputs=self.finetune_cost,

            updates=updates,

            givens={

                self.x: train_set_x[

                    index * batch_size: (index + 1) * batch_size

                ],

                self.y: train_set_y[

                    index * batch_size: (index + 1) * batch_size

                ]

            }

        )

        test_score_i = theano.function(

            [index],

            self.errors,

            givens={

                self.x: test_set_x[

                    index * batch_size: (index + 1) * batch_size

                ],

                self.y: test_set_y[

                    index * batch_size: (index + 1) * batch_size

                ]

            }

        )

        valid_score_i = theano.function(

            [index],

            self.errors,

            givens={

                self.x: valid_set_x[

                    index * batch_size: (index + 1) * batch_size

                ],

                self.y: valid_set_y[

                    index * batch_size: (index + 1) * batch_size

                ]

            }

        )

        # Create a function that scans the entire validation set

        def valid_score():

            return [valid_score_i(i) for i in range(n_valid_batches)]

        # Create a function that scans the entire test set

        def test_score():

            return [test_score_i(i) for i in range(n_test_batches)]

        return train_fn, valid_score, test_score
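
These three functions are driven by the supervised fine-tuning loop in the tutorial's test_DBN, which this post does not reproduce. A condensed sketch of that loop, with the patience-based early stopping omitted (train_fn, valid_score, and test_score come from build_finetune_functions; training_epochs and n_train_batches are assumed to be defined):

best_validation_loss = numpy.inf
for epoch in range(training_epochs):
    for minibatch_index in range(n_train_batches):
        train_fn(minibatch_index)                      # one SGD update
    this_validation_loss = numpy.mean(valid_score())   # mean error over validation batches
    if this_validation_loss < best_validation_loss:
        best_validation_loss = this_validation_loss
        test_loss = numpy.mean(test_score())           # error on the test set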



Part 7

    numpy_rng = numpy.random.RandomState(123)

    print('... building the model')

    # construct the Deep Belief Network

    dbn = DBN(numpy_rng=numpy_rng, n_ins=28 * 28,

              hidden_layers_sizes=[1000, 1000, 1000],

              n_outs=10)



Part 8

    #########################

    # PRETRAINING THE MODEL #

    #########################

    print('... getting the pretraining functions')

    pretraining_fns = dbn.pretraining_functions(train_set_x=train_set_x,

                                                batch_size=batch_size,

                                                k=k)

    print('... pre-training the model')

    start_time = timeit.default_timer()

    # Pre-train layer-wise

    for i in range(dbn.n_layers):

        # go through pretraining epochs

        for epoch in range(pretraining_epochs):

            # go through the training set

            c = []

            for batch_index in range(n_train_batches):

                c.append(pretraining_fns[i](index=batch_index,

                                            lr=pretrain_lr))

            print('Pre-training layer %i, epoch %d, cost ' % (i, epoch), end=' ')

            print(numpy.mean(c, dtype='float64'))

    end_time = timeit.default_timer()


With the default parameters, the code runs for 100 pre-training epochs with mini-batches of size 10. This corresponds to performing 500,000 unsupervised parameter updates. We use an unsupervised learning rate of 0.01 and a supervised learning rate of 0.1. The DBN itself consists of three hidden layers with 1000 units per layer. With early-stopping, this configuration achieved a minimal validation error of 1.27% and a corresponding test error of 1.34% after 46 supervised epochs.
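
(The update count follows from the 50,000-example MNIST training split used by the tutorial: 50,000 examples / batch size 10 = 5,000 minibatches per epoch, and 100 epochs gives 500,000 updates for each layer being pre-trained.)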

On an Intel(R) Xeon(R) CPU X5560 running at 2.80GHz, using a multi-threaded MKL library (running on 4 cores), pretraining took 615 minutes with an average of 2.05 mins/(layer * epoch). Fine-tuning took only 101 minutes or approximately 2.20 mins/epoch.

Hyper-parameters were selected by optimizing on the validation error. We tested unsupervised learning rates in {10^-1, ..., 10^-5} and supervised learning rates in {10^-1, ..., 10^-4}. We did not use any form of regularization besides early-stopping, nor did we optimize over the number of pretraining updates.
