22 Recurrent Neural Networks

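The previous post introduced convolutional layers, which are used to build the very common convolutional neural networks and similar models. This post covers recurrent layers, the basic building blocks for recurrent neural networks (RNNs). For background on RNNs, see the article "A Deep Dive into Recurrent Neural Nets". If that article feels too abstract, I strongly recommend reading "The Unreasonable Effectiveness of Recurrent Neural Networks", which shows the power of RNNs through concrete examples. Let's look at the recurrent layers that are available.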

1. SimpleRNN

keras.layers.recurrent.SimpleRNN(output_dim,  
        init='glorot_uniform', inner_init='orthogonal', activation='sigmoid', weights=None,  
        truncate_gradient=-1, return_sequences=False, input_dim=None, input_length=None)

A fully connected RNN in which the output is fed back to the input.
**Input shape**: 3D tensor with shape (nb_samples, timesteps, input_dim).
**Output shape**: if return_sequences=True, a 3D tensor with shape (nb_samples, timesteps, output_dim); otherwise a 2D tensor with shape (nb_samples, output_dim).
**Masking**: This layer supports masking for input data with a variable number of timesteps. To introduce masks to your data, use an Embedding layer with the mask_zero parameter set to True.
**Arguments**:

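output_dim: dimension of the internal computations and of the final output.
init: weight initialization function, given either as the name of a built-in Keras initialization or as a custom Theano function. It only takes effect when no weights argument is passed.
inner_init: initialization function for the inner (recurrent) weights.
activation: activation function, given either as the name of a built-in Keras activation or as a custom Theano function. If not specified, the default shown in the signature above ('sigmoid') is used.
weights: list of numpy arrays used as the initial weights. The list should have 3 elements, with shapes [(input_dim, output_dim), (output_dim, output_dim), (output_dim,)].
truncate_gradient: number of steps to use in truncated BPTT (backpropagation through time, i.e. backpropagation unrolled along the time dimension). The default of -1 means no truncation.
return_sequences: Boolean. If False, only the last output of the output sequence is returned; if True, the whole sequence is returned.
input_dim: dimensionality of the input. When this layer is the first layer in a model, either this argument or input_shape must be provided.
input_length: length of the input sequences. This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed).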
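To make the shapes concrete, here is a minimal NumPy sketch of what a SimpleRNN-style layer computes. It is not the Keras implementation: the function name simple_rnn_forward, the zero initial state, and the random test data are assumptions made for illustration. It only shows how the three weight arrays and the return_sequences flag relate to the input and output shapes described above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def simple_rnn_forward(x, weights, activation=sigmoid, return_sequences=False):
    """x: (nb_samples, timesteps, input_dim); weights: [W, U, b]."""
    W, U, b = weights
    nb_samples, timesteps, _ = x.shape
    h = np.zeros((nb_samples, b.shape[0]))  # assumed zero initial state
    outputs = []
    for t in range(timesteps):
        # The previous output h is fed back alongside the current input.
        h = activation(x[:, t, :] @ W + h @ U + b)
        outputs.append(h)
    if return_sequences:
        return np.stack(outputs, axis=1)  # (nb_samples, timesteps, output_dim)
    return h                              # (nb_samples, output_dim)

rng = np.random.default_rng(0)
nb_samples, timesteps, input_dim, output_dim = 4, 5, 3, 2
x = rng.normal(size=(nb_samples, timesteps, input_dim))
weights = [
    rng.normal(size=(input_dim, output_dim)),   # W: input -> hidden
    rng.normal(size=(output_dim, output_dim)),  # U: hidden -> hidden
    np.zeros(output_dim),                       # b: bias
]
seq = simple_rnn_forward(x, weights, return_sequences=True)
last = simple_rnn_forward(x, weights, return_sequences=False)
print(seq.shape, last.shape)  # (4, 5, 2) (4, 2)
```

With return_sequences=False the result is simply the last timestep of the full sequence, which is why the output drops from three dimensions to two.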

2. SimpleDeepRNN

keras.layers.recurrent.SimpleDeepRNN(output_dim,depth=3,  
        init='glorot_uniform', inner_init='orthogonal',  
        activation='sigmoid', inner_activation='hard_sigmoid',  
        weights=None, truncate_gradient=-1, return_sequences=False,  
        input_dim=None, input_length=None)

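A fully connected RNN in which the outputs of several previous timesteps (how many is set by the depth argument) are fed back to the input. The computation is: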

output = activation(W.x_t + b + inner_activation(U_1.h_tm1) + inner_activation(U_2.h_tm2) + ...)
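The formula above can be sketched in NumPy as follows. This is an illustrative sketch, not the Keras source: the function name simple_deep_rnn_forward, the zero-filled history before the sequence starts, and the piecewise-linear form of hard_sigmoid are assumptions. The weights list is laid out as [W, U_1, ..., U_depth, b], which gives the depth + 2 arrays this layer expects.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hard_sigmoid(z):
    # Assumed piecewise-linear form: clip(0.2 * z + 0.5, 0, 1).
    return np.clip(0.2 * z + 0.5, 0.0, 1.0)

def simple_deep_rnn_forward(x, weights, activation=sigmoid,
                            inner_activation=hard_sigmoid):
    """x: (nb_samples, timesteps, input_dim); weights: [W, U_1, ..., U_depth, b]."""
    W, Us, b = weights[0], weights[1:-1], weights[-1]
    nb_samples, timesteps, _ = x.shape
    # history[k] holds h_{t-1-k}; assumed to be zeros before the sequence starts.
    history = [np.zeros((nb_samples, b.shape[0])) for _ in Us]
    outputs = []
    for t in range(timesteps):
        pre = x[:, t, :] @ W + b
        for U_k, h_past in zip(Us, history):
            pre = pre + inner_activation(h_past @ U_k)
        h = activation(pre)
        history = [h] + history[:-1]  # shift the window of past outputs
        outputs.append(h)
    return np.stack(outputs, axis=1)

rng = np.random.default_rng(0)
nb_samples, timesteps, input_dim, output_dim, depth = 2, 6, 3, 4, 3
x = rng.normal(size=(nb_samples, timesteps, input_dim))
weights = ([rng.normal(size=(input_dim, output_dim))]
           + [rng.normal(size=(output_dim, output_dim)) for _ in range(depth)]
           + [np.zeros(output_dim)])
out = simple_deep_rnn_forward(x, weights)
print(len(weights), out.shape)  # 5 (2, 6, 4)
```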

**Input shape**: 3D tensor with shape (nb_samples, timesteps, input_dim).
**Output shape**: if return_sequences=True, a 3D tensor with shape (nb_samples, timesteps, output_dim); otherwise a 2D tensor with shape (nb_samples, output_dim).
**Masking**: This layer supports masking for input data with a variable number of timesteps. To introduce masks to your data, use an Embedding layer with the mask_zero parameter set to True.
**Arguments**:

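output_dim: dimension of the internal computations and of the final output.
depth: int >= 1. The number of past timesteps fed back into the computation. With depth=1 the layer is equivalent to SimpleRNN.
init: weight initialization function, given either as the name of a built-in Keras initialization or as a custom Theano function. It only takes effect when no weights argument is passed.
inner_init: initialization function for the inner (recurrent) weights.
activation: activation function, given either as the name of a built-in Keras activation or as a custom Theano function. If not specified, the default shown in the signature above ('sigmoid') is used.
inner_activation: activation function for the inner hidden units.
weights: list of numpy arrays used as the initial weights. The list should have depth+2 elements.
truncate_gradient: number of steps to use in truncated BPTT (backpropagation through time, i.e. backpropagation unrolled along the time dimension). The default of -1 means no truncation.
return_sequences: Boolean. If False, only the last output of the output sequence is returned; if True, the whole sequence is returned.
input_dim: dimensionality of the input. When this layer is the first layer in a model, either this argument or input_shape must be provided.
input_length: length of the input sequences. This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed).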

3. GRU

keras.layers.recurrent.GRU(output_dim,  
        init='glorot_uniform', inner_init='orthogonal',  
        activation='sigmoid', inner_activation='hard_sigmoid',  
        weights=None, truncate_gradient=-1, return_sequences=False,  
        input_dim=None, input_length=None)

The Gated Recurrent Unit (GRU), proposed in 2014. It is one of the main units used to build RNN models.
**Input shape**: 3D tensor with shape (nb_samples, timesteps, input_dim).
**Output shape**: if return_sequences=True, a 3D tensor with shape (nb_samples, timesteps, output_dim); otherwise a 2D tensor with shape (nb_samples, output_dim).
**Masking**: This layer supports masking for input data with a variable number of timesteps. To introduce masks to your data, use an Embedding layer with the mask_zero parameter set to True.
**Arguments**:

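output_dim: dimension of the internal computations and of the final output.
init: weight initialization function, given either as the name of a built-in Keras initialization or as a custom Theano function. It only takes effect when no weights argument is passed.
inner_init: initialization function for the inner (recurrent) weights.
activation: activation function, given either as the name of a built-in Keras activation or as a custom Theano function. If not specified, the default shown in the signature above ('sigmoid') is used.
inner_activation: activation function for the inner hidden units.
weights: list of numpy arrays used as the initial weights. The list should have 9 elements.
truncate_gradient: number of steps to use in truncated BPTT (backpropagation through time, i.e. backpropagation unrolled along the time dimension). The default of -1 means no truncation.
return_sequences: Boolean. If False, only the last output of the output sequence is returned; if True, the whole sequence is returned.
input_dim: dimensionality of the input. When this layer is the first layer in a model, either this argument or input_shape must be provided.
input_length: length of the input sequences. This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed).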
**References for this section**:
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
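As a rough illustration of why the GRU weights list has 9 entries, the sketch below implements one common formulation of the GRU update in NumPy: an update gate z, a reset gate r, and a candidate state, each with an input matrix, a recurrent matrix, and a bias. The ordering of the arrays in the list, the gating convention (z multiplying the previous state), the zero initial state, and the names gru_step and gru_forward are assumptions for illustration, not the Keras source.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hard_sigmoid(z):
    # Assumed piecewise-linear form: clip(0.2 * z + 0.5, 0, 1).
    return np.clip(0.2 * z + 0.5, 0.0, 1.0)

def gru_step(x_t, h_prev, weights, activation=sigmoid,
             inner_activation=hard_sigmoid):
    W_z, U_z, b_z, W_r, U_r, b_r, W_h, U_h, b_h = weights
    z = inner_activation(x_t @ W_z + h_prev @ U_z + b_z)  # update gate
    r = inner_activation(x_t @ W_r + h_prev @ U_r + b_r)  # reset gate
    h_cand = activation(x_t @ W_h + (r * h_prev) @ U_h + b_h)
    # z decides how much of the previous state is carried over.
    return z * h_prev + (1.0 - z) * h_cand

def gru_forward(x, weights, return_sequences=False):
    nb_samples, timesteps, _ = x.shape
    h = np.zeros((nb_samples, weights[2].shape[0]))  # assumed zero initial state
    outputs = []
    for t in range(timesteps):
        h = gru_step(x[:, t, :], h, weights)
        outputs.append(h)
    return np.stack(outputs, axis=1) if return_sequences else h

rng = np.random.default_rng(0)
nb_samples, timesteps, input_dim, output_dim = 3, 5, 2, 4
x = rng.normal(size=(nb_samples, timesteps, input_dim))
weights = []
for _ in range(3):  # update gate, reset gate, candidate state
    weights += [rng.normal(size=(input_dim, output_dim)),
                rng.normal(size=(output_dim, output_dim)),
                np.zeros(output_dim)]
h_last = gru_forward(x, weights)
print(len(weights), h_last.shape)  # 9 (3, 4)
```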

4. LSTM

keras.layers.recurrent.LSTM(output_dim,  
        init='glorot_uniform', inner_init='orthogonal', forget_bias_init='one',  
        activation='tanh', inner_activation='hard_sigmoid',  
        weights=None, truncate_gradient=-1, return_sequences=False,  
        input_dim=None, input_length=None)

The Long Short-Term Memory (LSTM) unit, proposed by Hochreiter in 1997. It is one of the main units used to build RNNs.
**Input shape**: 3D tensor with shape (nb_samples, timesteps, input_dim).
**Output shape**: if return_sequences=True, a 3D tensor with shape (nb_samples, timesteps, output_dim); otherwise a 2D tensor with shape (nb_samples, output_dim).
**Masking**: This layer supports masking for input data with a variable number of timesteps. To introduce masks to your data, use an Embedding layer with the mask_zero parameter set to True.
**Arguments**:

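output_dim: dimension of the internal computations and of the final output.
init: weight initialization function, given either as the name of a built-in Keras initialization or as a custom Theano function. It only takes effect when no weights argument is passed.
inner_init: initialization function for the inner (recurrent) weights.
forget_bias_init: initialization function for the bias of the forget gate. Jozefowicz et al. recommend initializing it to ones.
activation: activation function, given either as the name of a built-in Keras activation or as a custom Theano function. If not specified, the default shown in the signature above ('tanh') is used.
inner_activation: activation function for the inner hidden units.
weights: list of numpy arrays used as the initial weights. The list should have 12 elements.
truncate_gradient: number of steps to use in truncated BPTT (backpropagation through time, i.e. backpropagation unrolled along the time dimension). The default of -1 means no truncation.
return_sequences: Boolean. If False, only the last output of the output sequence is returned; if True, the whole sequence is returned.
input_dim: dimensionality of the input. When this layer is the first layer in a model, either this argument or input_shape must be provided.
input_length: length of the input sequences. This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed).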
**References for this section**:
Long short-term memory (original 1997 paper)
Learning to forget: Continual prediction with LSTM
Supervised sequence labelling with recurrent neural networks
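The 12 weight arrays and the role of forget_bias_init can be illustrated with a small NumPy sketch of a standard LSTM step: four blocks (input gate, cell candidate, forget gate, output gate), each with an input matrix, a recurrent matrix, and a bias. The ordering of the arrays, the helper names init_lstm_weights and lstm_step, the zero initial states, and the small random initial weights are assumptions for illustration rather than the Keras source.

```python
import numpy as np

def hard_sigmoid(z):
    # Assumed piecewise-linear form: clip(0.2 * z + 0.5, 0, 1).
    return np.clip(0.2 * z + 0.5, 0.0, 1.0)

def init_lstm_weights(input_dim, output_dim, rng):
    """Build [W_i, U_i, b_i, W_c, U_c, b_c, W_f, U_f, b_f, W_o, U_o, b_o]."""
    weights = []
    for block in ("i", "c", "f", "o"):
        # The forget-gate bias starts at one, all other biases at zero.
        bias = np.ones(output_dim) if block == "f" else np.zeros(output_dim)
        weights += [rng.normal(scale=0.1, size=(input_dim, output_dim)),
                    rng.normal(scale=0.1, size=(output_dim, output_dim)),
                    bias]
    return weights

def lstm_step(x_t, h_prev, c_prev, weights, activation=np.tanh,
              inner_activation=hard_sigmoid):
    W_i, U_i, b_i, W_c, U_c, b_c, W_f, U_f, b_f, W_o, U_o, b_o = weights
    i = inner_activation(x_t @ W_i + h_prev @ U_i + b_i)  # input gate
    f = inner_activation(x_t @ W_f + h_prev @ U_f + b_f)  # forget gate
    c = f * c_prev + i * activation(x_t @ W_c + h_prev @ U_c + b_c)
    o = inner_activation(x_t @ W_o + h_prev @ U_o + b_o)  # output gate
    h = o * activation(c)
    return h, c

rng = np.random.default_rng(0)
nb_samples, timesteps, input_dim, output_dim = 3, 5, 2, 4
x = rng.normal(size=(nb_samples, timesteps, input_dim))
weights = init_lstm_weights(input_dim, output_dim, rng)
h = np.zeros((nb_samples, output_dim))  # assumed zero initial hidden state
c = np.zeros((nb_samples, output_dim))  # assumed zero initial cell state
for t in range(timesteps):
    h, c = lstm_step(x[:, t, :], h, c, weights)
print(len(weights), h.shape)  # 12 (3, 4)
```

Starting the forget-gate bias at one keeps the gate mostly open early in training, so the cell state is carried forward by default instead of being erased.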

5. JZS1, JZS2, JZS3

keras.layers.recurrent.JZS1(output_dim,  
        init='glorot_uniform', inner_init='orthogonal',  
        activation='tanh', inner_activation='sigmoid',  
        weights=None, truncate_gradient=-1, return_sequences=False,  
        input_dim=None, input_length=None)

JZS1, JZS2, and JZS3 are the top 3 RNN unit architectures that evolved from the evaluation of thousands of models. They serve as alternatives to GRU and LSTM. They correspond to the MUT1, MUT2, and MUT3 architectures proposed in "An Empirical Exploration of Recurrent Network Architectures" (Jozefowicz et al., 2015).
**Input shape**: 3D tensor with shape (nb_samples, timesteps, input_dim).
**Output shape**: if return_sequences=True, a 3D tensor with shape (nb_samples, timesteps, output_dim); otherwise a 2D tensor with shape (nb_samples, output_dim).
**Masking**: This layer supports masking for input data with a variable number of timesteps. To introduce masks to your data, use an Embedding layer with the mask_zero parameter set to True.
**Arguments**:

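output_dim: dimension of the internal computations and of the final output.
init: weight initialization function, given either as the name of a built-in Keras initialization or as a custom Theano function. It only takes effect when no weights argument is passed.
inner_init: initialization function for the inner (recurrent) weights.
activation: activation function, given either as the name of a built-in Keras activation or as a custom Theano function. If not specified, the default shown in the signature above ('tanh') is used.
inner_activation: activation function for the inner hidden units.
weights: list of numpy arrays used as the initial weights. The list should have 9 elements.
truncate_gradient: number of steps to use in truncated BPTT (backpropagation through time, i.e. backpropagation unrolled along the time dimension). The default of -1 means no truncation.
return_sequences: Boolean. If False, only the last output of the output sequence is returned; if True, the whole sequence is returned.
input_dim: dimensionality of the input. When this layer is the first layer in a model, either this argument or input_shape must be provided.
input_length: length of the input sequences. This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed).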
**References for this section**:
An Empirical Exploration of Recurrent Network Architectures


Last edited: 2017.12.08 01:29:57
