Creating a simple LSTM
In TensorFlow, a complete LSTM structure can be created with a single line of code.
lstm = tf.nn.rnn_cell.BasicLSTMCell(lstm_hidden_size)
Use the zero_state function to initialize the LSTM's initial state to an all-zeros array:
state = lstm.zero_state(batch_size, tf.float32)
for i in range(num_steps):
    # Calling the cell object advances the LSTM by one time step
    output, state = lstm(input, state)
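For reference, a minimal runnable sketch of this one-step-at-a-time unrolling (TF 1.x); the sizes and the inputs placeholder are hypothetical values chosen for illustration:

import tensorflow as tf

batch_size, num_steps, input_dim, lstm_hidden_size = 2, 5, 8, 16  # hypothetical sizes
inputs = tf.placeholder(tf.float32, [batch_size, num_steps, input_dim])

lstm = tf.nn.rnn_cell.BasicLSTMCell(lstm_hidden_size)
state = lstm.zero_state(batch_size, tf.float32)

outputs = []
with tf.variable_scope("rnn"):
    for i in range(num_steps):
        if i > 0:
            tf.get_variable_scope().reuse_variables()  # share weights across time steps
        output, state = lstm(inputs[:, i, :], state)
        outputs.append(output)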
Creating a multi-layer LSTM
To create a deep recurrent neural network, stack cells with MultiRNNCell; zero_state can likewise be used for initialization.
def lstm_cell(size):  # factory: each layer must get its own cell instance
    return tf.nn.rnn_cell.BasicLSTMCell(size)
stacked_lstm = tf.nn.rnn_cell.MultiRNNCell(
    [lstm_cell(lstm_size) for _ in range(number_of_layers)])
state = stacked_lstm.zero_state(batch_size, tf.float32)
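The stacked cell is then called exactly like a single cell, one time step per call; current_input below is a hypothetical [batch_size, input_dim] tensor:

for i in range(num_steps):
    # MultiRNNCell behaves like a single RNNCell; its state is a tuple
    # with one LSTMStateTuple per layer
    output, state = stacked_lstm(current_input, state)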
Using Dropout in an LSTM
tf.nn.rnn_cell.DropoutWrapper(
cell,
input_keep_prob=1.0,
output_keep_prob=1.0,
state_keep_prob=1.0,
variational_recurrent=False,
input_size=None,
dtype=None,
seed=None,
dropout_state_filter_visitor=None
)
Wrap each base cell (here tf.nn.rnn_cell.BasicLSTMCell, via the lstm_cell factory above) with DropoutWrapper before stacking:
stacked_lstm = tf.nn.rnn_cell.MultiRNNCell(
    [tf.nn.rnn_cell.DropoutWrapper(lstm_cell(lstm_size))
     for _ in range(number_of_layers)])
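A sketch of a common pattern (assumed here, not from the original text): feed the keep probability through a placeholder so that dropout can be disabled at evaluation time by feeding 1.0.

keep_prob = tf.placeholder_with_default(1.0, shape=[])  # feed < 1.0 only during training

def dropout_lstm_cell(size):
    cell = tf.nn.rnn_cell.BasicLSTMCell(size)
    # Apply dropout to the cell outputs only, a common default
    return tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=keep_prob)

stacked_lstm = tf.nn.rnn_cell.MultiRNNCell(
    [dropout_lstm_cell(lstm_size) for _ in range(number_of_layers)])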
BiLSTM
tf.nn.bidirectional_dynamic_rnn(
cell_fw,
cell_bw,
inputs,
initial_state_fw=None,
initial_state_bw=None,
sequence_length=None,
dtype=None,
parallel_iterations=None,
swap_memory=False,
time_major=False,
scope=None
)
Outputs (outputs, output_states):
- outputs: all outputs across the time_steps steps. It is a tuple (output_fw, output_bw) containing the forward and backward results; each has shape [batch_size, max_time, cell_fw.output_size]. It returns a tuple instead of a single concatenated Tensor; if the concatenated form is preferred, the forward and backward outputs can be joined with tf.concat(outputs, 2).
- output_states: a tuple (output_state_fw, output_state_bw) containing the final-step states of the forward and backward directions.
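A minimal usage sketch (TF 1.x), with hypothetical sizes, showing the tf.concat(outputs, 2) trick mentioned above:

batch_size, max_time, input_dim, lstm_hidden_size = 2, 7, 8, 16  # hypothetical
inputs = tf.placeholder(tf.float32, [batch_size, max_time, input_dim])

cell_fw = tf.nn.rnn_cell.BasicLSTMCell(lstm_hidden_size)
cell_bw = tf.nn.rnn_cell.BasicLSTMCell(lstm_hidden_size)
outputs, output_states = tf.nn.bidirectional_dynamic_rnn(
    cell_fw, cell_bw, inputs, dtype=tf.float32)

# Concatenated form: [batch_size, max_time, 2 * lstm_hidden_size]
concat_outputs = tf.concat(outputs, 2)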
dynamic_rnn
- When using dynamic_rnn, the maximum sequence length does not have to be the same across batches: the first batch can have shape 2 × 4 while the second has shape 2 × 7. During training, dynamic_rnn dynamically unrolls to the maximum length of each batch, which is why it is called dynamic.
Note that although the LSTM above can take a whole batch as input, each invocation of the cell advances only one time step, which makes it the equivalent of LSTMCell in PyTorch. What, then, is the equivalent of PyTorch's LSTM? The answer is tf.nn.dynamic_rnn:
tf.nn.dynamic_rnn(
cell,
inputs,
initial_state=None,
sequence_length=None,
dtype=None,
parallel_iterations=None,
swap_memory=False,
time_major=False,
scope=None
)
Input parameters:
- cell: an RNNCell instance
- inputs: the input sequence to the RNN
- initial_state: the initial state of the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
- sequence_length: a tensor of shape [batch_size] in which each value is that sequence's length (i.e. time_steps), e.g. sequence_length=tf.fill([batch_size], time_steps)
- time_major: defaults to False, in which case the input and output tensors have shape [batch_size, max_time, depth]; when True, it avoids transposes at the beginning and end of the RNN calculation, and the tensors have shape [max_time, batch_size, depth]
- scope: VariableScope for the created subgraph; defaults to "rnn"
Outputs:
- outputs: all outputs across the time_steps steps, with shape [batch_size, max_time, cell.output_size]
- state: the hidden state of the last step, with shape [batch_size, cell.state_size]
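A minimal usage sketch (TF 1.x), with hypothetical sizes; note that for a BasicLSTMCell the returned state is actually an LSTMStateTuple (c, h):

batch_size, max_time, input_dim, lstm_hidden_size = 2, 7, 8, 16  # hypothetical
inputs = tf.placeholder(tf.float32, [batch_size, max_time, input_dim])
seq_len = tf.fill([batch_size], max_time)

cell = tf.nn.rnn_cell.BasicLSTMCell(lstm_hidden_size)
outputs, state = tf.nn.dynamic_rnn(
    cell, inputs, sequence_length=seq_len, dtype=tf.float32)
# outputs: [batch_size, max_time, lstm_hidden_size]
# state:   LSTMStateTuple(c, h), each of shape [batch_size, lstm_hidden_size]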