[tf]LSTM

Creating a simple LSTM

In TensorFlow, a complete LSTM structure can be created with a single line:

lstm = tf.nn.rnn_cell.BasicLSTMCell(lstm_hidden_size)

To initialize the LSTM's initial state to an all-zero array, use the zero_state function:

state = lstm.zero_state(batch_size, tf.float32)
for i in range(num_steps):
    # current_input is the batch input at time step i; calling the cell
    # directly (rather than .call) lets it reuse its variables.
    output, state = lstm(current_input, state)
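
Putting this together, a minimal runnable sketch (TensorFlow 1.x) might look as follows; the sizes and the placeholder inputs here are illustrative assumptions, not values from the original text.

import tensorflow as tf

# Illustrative sizes (assumptions for this sketch).
lstm_hidden_size, batch_size, num_steps, input_dim = 64, 32, 10, 128
inputs = tf.placeholder(tf.float32, [batch_size, num_steps, input_dim])

lstm = tf.nn.rnn_cell.BasicLSTMCell(lstm_hidden_size)
state = lstm.zero_state(batch_size, tf.float32)

outputs = []
for i in range(num_steps):
    # Advance one time step; the same cell object reuses its weights.
    output, state = lstm(inputs[:, i, :], state)
    outputs.append(output)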

Creating a multi-layer LSTM

A deep recurrent network is built by stacking cells with MultiRNNCell; zero_state can likewise be used for initialization.

# BasicLSTMCell here is the cell class, not an instance: MultiRNNCell
# needs a fresh cell object for every layer.
lstm_cell = tf.nn.rnn_cell.BasicLSTMCell
stacked_lstm = tf.nn.rnn_cell.MultiRNNCell(
    [lstm_cell(lstm_size) for _ in range(number_of_layers)])
state = stacked_lstm.zero_state(batch_size, tf.float32)
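
A short sketch of driving the stacked cell for one time step; x_t and the sizes below are illustrative assumptions.

import tensorflow as tf

lstm_size, number_of_layers, batch_size, input_dim = 64, 3, 32, 128  # assumed sizes
x_t = tf.placeholder(tf.float32, [batch_size, input_dim])  # input at one time step

stacked_lstm = tf.nn.rnn_cell.MultiRNNCell(
    [tf.nn.rnn_cell.BasicLSTMCell(lstm_size) for _ in range(number_of_layers)])

# zero_state returns one LSTMStateTuple (c, h) per layer.
state = stacked_lstm.zero_state(batch_size, tf.float32)
output, state = stacked_lstm(x_t, state)  # output: [batch_size, lstm_size]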

Using Dropout in an LSTM

Dropout is added by wrapping a cell in tf.nn.rnn_cell.DropoutWrapper:

tf.nn.rnn_cell.DropoutWrapper(
    cell,
    input_keep_prob=1.0,
    output_keep_prob=1.0,
    state_keep_prob=1.0,
    variational_recurrent=False,
    input_size=None,
    dtype=None,
    seed=None,
    dropout_state_filter_visitor=None
)
lstm_cell = tf.nn.rnn_cell.BasicLSTMCell
stacked_lstm = tf.nn.rnn_cell.MultiRNNCell(
    [tf.nn.rnn_cell.DropoutWrapper(lstm_cell(lstm_size))
     for _ in range(number_of_layers)])
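
In practice keep_prob is usually fed as a placeholder so dropout can be switched off at evaluation time. The sketch below assumes dropout on each layer's output only; the sizes are illustrative.

import tensorflow as tf

lstm_size, number_of_layers = 64, 3  # assumed sizes

# Defaults to 1.0 (no dropout) unless a smaller value is fed at training time.
keep_prob = tf.placeholder_with_default(1.0, shape=[])

def make_cell():
    cell = tf.nn.rnn_cell.BasicLSTMCell(lstm_size)
    return tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=keep_prob)

stacked_lstm = tf.nn.rnn_cell.MultiRNNCell(
    [make_cell() for _ in range(number_of_layers)])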

BiLSTM

A bidirectional LSTM is built with tf.nn.bidirectional_dynamic_rnn:

tf.nn.bidirectional_dynamic_rnn(
    cell_fw, 
    cell_bw, 
    inputs, 
    initial_state_fw=None, 
    initial_state_bw=None, 
    sequence_length=None, 
    dtype=None, 
    parallel_iterations=None, 
    swap_memory=False, 
    time_major=False, 
    scope=None
)

The return value is a pair (outputs, output_states):

  • outputs: the outputs at all time steps, as a tuple (output_fw, output_bw) holding the forward and backward results; each has shape [batch_size, max_time, cell_fw.output_size]. A tuple is returned rather than a single concatenated Tensor; if the concatenated form is preferred, the forward and backward outputs can be joined with tf.concat(outputs, 2) (see the sketch after this list).
  • output_states: a tuple (output_state_fw, output_state_bw) holding the final states of the forward and backward RNNs.
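
A minimal sketch of calling bidirectional_dynamic_rnn and concatenating the two directions; the sizes below are illustrative assumptions.

import tensorflow as tf

batch_size, max_time, input_dim, hidden_size = 32, 20, 128, 64  # assumed sizes
inputs = tf.placeholder(tf.float32, [batch_size, max_time, input_dim])

cell_fw = tf.nn.rnn_cell.BasicLSTMCell(hidden_size)
cell_bw = tf.nn.rnn_cell.BasicLSTMCell(hidden_size)

outputs, output_states = tf.nn.bidirectional_dynamic_rnn(
    cell_fw, cell_bw, inputs, dtype=tf.float32)

# Join forward and backward outputs: [batch_size, max_time, 2 * hidden_size]
concat_outputs = tf.concat(outputs, 2)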

dynamic_rnn

  • With dynamic_rnn, the maximum sequence length does not have to be the same across batches: the first batch may have shape 2 × 4 while the second has shape 2 × 7. At training time dynamic_rnn unrolls each batch to its own maximum length, which is why it is called dynamic.
  • Note that although the LSTM above takes a whole batch as input, each call advances only one time step, much like LSTMCell in PyTorch. The counterpart of PyTorch's LSTM is tf.nn.dynamic_rnn:
tf.nn.dynamic_rnn(
    cell,
    inputs,
    initial_state=None, 
    sequence_length=None, 
    dtype=None, 
    parallel_iterations=None, 
    swap_memory=False, 
    time_major=False, 
    scope=None
)

Input arguments:

  • cell: an instance of RNNCell
  • inputs: the RNN input sequence
  • initial_state: the initial state of the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
  • sequence_length: a vector of shape [batch_size] in which each entry is that example's sequence length (i.e. its number of time steps), e.g. sequence_length=tf.fill([batch_size], time_steps)
  • time_major: defaults to False, in which case input and output tensors have shape [batch_size, max_time, depth]; when True, it avoids transposes at the beginning and end of the RNN calculation, and the tensors have shape [max_time, batch_size, depth]
  • scope: VariableScope for the created subgraph; defaults to “rnn”.

Outputs:

  • outputs: the outputs at all time steps, with shape [batch_size, max_time, cell.output_size]
  • state: the final hidden state, with shape [batch_size, cell.state_size]
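
Putting the arguments together, a minimal sketch of a full dynamic_rnn call with per-example sequence lengths; the sizes are illustrative assumptions.

import tensorflow as tf

batch_size, max_time, input_dim, hidden_size = 32, 20, 128, 64  # assumed sizes
inputs = tf.placeholder(tf.float32, [batch_size, max_time, input_dim])
seq_len = tf.placeholder(tf.int32, [batch_size])  # true length of each example

cell = tf.nn.rnn_cell.BasicLSTMCell(hidden_size)
outputs, state = tf.nn.dynamic_rnn(
    cell, inputs, sequence_length=seq_len, dtype=tf.float32)

# outputs: [batch_size, max_time, hidden_size], zero-padded past each sequence's end
# state: an LSTMStateTuple (c, h), each of shape [batch_size, hidden_size]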