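(8) Sequence to Sequence, Part 5

Implementing a multi-layer bidirectional dynamic LSTM with beam search

A seq2seq implementation based on TensorFlow 1.4

The encoder is a two-layer bidirectional LSTM. Note that rnn.MultiRNNCell and tf.nn.bidirectional_dynamic_rnn are not compatible with each other, so the two bidirectional layers are stacked by hand.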

import helpers
import tensorflow as tf
from tensorflow.python.util import nest
from tensorflow.contrib import seq2seq,rnn

tf.__version__

tf.reset_default_graph()
sess = tf.InteractiveSession()

PAD = 0
EOS = 1


vocab_size = 10
input_embedding_size = 20
encoder_hidden_units = 25

decoder_hidden_units = encoder_hidden_units

import helpers as data_helpers
batch_size = 10

# A generator that yields one minibatch of random sequences on each call

batches = data_helpers.random_sequences(length_from=3, length_to=8,
                                   vocab_lower=2, vocab_upper=10,
                                   batch_size=batch_size)

print('Generated %d sequences of varying length (min 3, max 8); the first ten are:' % batch_size)
for seq in next(batches)[:min(batch_size, 10)]:
    print(seq)

tf.reset_default_graph()
sess = tf.InteractiveSession()
mode = tf.contrib.learn.ModeKeys.TRAIN
Generated 10 sequences of varying length (min 3, max 8); the first ten are:
[6, 6, 3, 9, 7, 7, 9, 4]
[9, 3, 6, 3, 6, 6, 4, 5]
[5, 4, 2, 2, 3, 9, 8, 7]
[3, 2, 7]
[8, 5, 9, 4, 5, 2]
[6, 5, 8, 9, 4]
[3, 9, 6, 5, 2, 2]
[3, 2, 2, 3]
[8, 8, 7, 6, 8]
[5, 3, 3, 6, 8, 7, 4, 9]
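The helpers module itself is not listed in this post. As a rough sketch only, and assuming it behaves the way the output above suggests (lengths between 3 and 8, token ids between 2 and 9, PAD = 0), a pure-Python stand-in might look like the following. The names random_sequences and pad_batch are illustrative, not the real module. The helper used in the post apparently returns a time-major array, since the feed code transposes it with .T, whereas this sketch stays batch-major.

```python
import random

PAD = 0  # padding id, same value as in the post


def random_sequences(length_from, length_to, vocab_lower, vocab_upper, batch_size):
    """Hypothetical stand-in: yield minibatches of random integer sequences forever.

    Lengths are drawn from [length_from, length_to] (inclusive) and token ids
    from [vocab_lower, vocab_upper) (upper bound exclusive).
    """
    while True:
        yield [
            [random.randrange(vocab_lower, vocab_upper)
             for _ in range(random.randint(length_from, length_to))]
            for _ in range(batch_size)
        ]


def pad_batch(sequences, pad=PAD):
    """Right-pad every sequence to the longest one.

    Returns (padded batch-major list of lists, list of true lengths).
    """
    lengths = [len(seq) for seq in sequences]
    max_len = max(lengths)
    padded = [seq + [pad] * (max_len - len(seq)) for seq in sequences]
    return padded, lengths


batches_demo = random_sequences(3, 8, 2, 10, batch_size=4)
padded, lengths = pad_batch(next(batches_demo))
for row, n in zip(padded, lengths):
    print(n, row)
```

Keeping the true lengths next to the padded batch matters because both the encoder (sequence_length) and the loss mask need them later.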

1. Implementing the seq2seq model with the seq2seq library

with tf.name_scope('minibatch'):
    encoder_inputs = tf.placeholder(tf.int32, [None, None], name='encoder_inputs')
    
    encoder_inputs_length = tf.placeholder(tf.int32, [None], name='encoder_inputs_length')
    
    decoder_targets = tf.placeholder(tf.int32, [None, None], name='decoder_targets')
    
    decoder_inputs = tf.placeholder(shape=(None, None),dtype=tf.int32,name='decoder_inputs')
    
    # decoder_inputs_length and decoder_targets_length are the same
    # (each is the sequence length plus one EOS token)
    decoder_inputs_length = tf.placeholder(shape=(None,),
                                            dtype=tf.int32,
                                            name='decoder_inputs_length')
    
# Build the embedding matrix; the encoder and the decoder share this embedding
embedding = tf.get_variable('embedding', [vocab_size,input_embedding_size])
encoder_inputs_embedded = tf.nn.embedding_lookup(embedding,encoder_inputs)
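Conceptually, an embedding lookup is just row indexing into a [vocab_size, embedding_size] table. The following pure-Python sketch illustrates the shapes involved; it is not TensorFlow code, and make_embedding with its uniform initialisation is an assumption made for the example.

```python
import random


def make_embedding(vocab_size, embedding_size, seed=0):
    """Illustrative table of shape [vocab_size, embedding_size] with small random values."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(embedding_size)]
            for _ in range(vocab_size)]


def embedding_lookup(table, ids):
    """Map ids of shape [batch, time] to vectors of shape [batch, time, embedding_size]."""
    return [[table[token] for token in row] for row in ids]


table = make_embedding(vocab_size=10, embedding_size=20)
embedded = embedding_lookup(table, [[3, 2, 7, 0], [8, 5, 9, 4]])
print(len(embedded), len(embedded[0]), len(embedded[0][0]))  # batch, time, embedding_size
```

Because the same table is indexed for encoder and decoder tokens, a token id maps to the same vector on both sides.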

#fw_cell = bw_cell =  rnn.LSTMCell(encoder_hidden_units)

Defining the encoder: a two-layer bidirectional LSTM

_inputs = encoder_inputs_embedded
for _ in range(2):
    # Why a variable_scope here? TF forces it: rnn_cell's __call__ checks the
    # variable scope, and without a fresh scope per layer it raises an error.
    with tf.variable_scope(None, default_name="bidirectional-rnn"):
        # Separate cell objects for the two directions; one shared LSTMCell
        # object would tie the forward and backward weights together.
        rnn_cell_fw = rnn.LSTMCell(encoder_hidden_units)
        rnn_cell_bw = rnn.LSTMCell(encoder_hidden_units)
        #initial_state_fw = rnn_cell_fw.zero_state(batch_size, dtype=tf.float32)
        #initial_state_bw = rnn_cell_bw.zero_state(batch_size, dtype=tf.float32)
        ((encoder_fw_outputs, encoder_bw_outputs),
         (encoder_fw_final_state, encoder_bw_final_state)) = tf.nn.bidirectional_dynamic_rnn(
            cell_fw=rnn_cell_fw,
            cell_bw=rnn_cell_bw,
            inputs=_inputs,
            sequence_length=encoder_inputs_length,
            dtype=tf.float32)
        # Concatenate both directions on the feature axis; this is the input of the next layer
        _inputs = tf.concat((encoder_fw_outputs, encoder_bw_outputs), 2)
# Take the final_state of the last layer
encoder_final_state_h = tf.concat((encoder_fw_final_state.h, encoder_bw_final_state.h), 1)
encoder_final_state_c = tf.concat((encoder_fw_final_state.c, encoder_bw_final_state.c), 1)
encoder_final_state = rnn.LSTMStateTuple(c=encoder_final_state_c, h=encoder_final_state_h)
encoder_final_output = _inputs
encoder_final_state
LSTMStateTuple(c=<tf.Tensor 'concat_3:0' shape=(?, 50) dtype=float32>, h=<tf.Tensor 'concat_2:0' shape=(?, 50) dtype=float32>)
encoder_final_output
<tf.Tensor 'bidirectional-rnn_4/concat:0' shape=(?, ?, 50) dtype=float32>
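To make the shapes above less abstract, here is a toy, pure-Python sketch of what one bidirectional layer does with a padded sequence. It assumes a scalar "cell" (a running sum) instead of an LSTM, so it only illustrates the data flow: the forward pass reads steps 0..length-1, the backward pass reads them in reverse, positions past the true length stay zero, and the two directions are paired per time step. The function names are made up for this example.

```python
def run_direction(seq, length, step, init, reverse=False):
    """Run a toy recurrent step over the first `length` items of `seq`.

    Outputs at padded positions (t >= length) are left at zero.
    Returns (per-step outputs, final state).
    """
    order = range(length - 1, -1, -1) if reverse else range(length)
    state = init
    outputs = [0] * len(seq)
    for t in order:
        state = step(state, seq[t])
        outputs[t] = state
    return outputs, state


def bidirectional(seq, length, step, init=0):
    """Pair forward and backward outputs per time step, like concat on the feature axis."""
    fw_out, fw_state = run_direction(seq, length, step, init)
    bw_out, bw_state = run_direction(seq, length, step, init, reverse=True)
    outputs = list(zip(fw_out, bw_out))
    return outputs, (fw_state, bw_state)


# A padded sequence [3, 2, 7, PAD, PAD] with true length 3; the "cell" is a running sum.
outputs, final_state = bidirectional([3, 2, 7, 0, 0], length=3, step=lambda s, x: s + x)
print(outputs)      # [(3, 12), (5, 9), (12, 7), (0, 0), (0, 0)]
print(final_state)  # (12, 12)
```

With real LSTM cells of 25 units each, pairing the two directions is what turns the feature size into 50, which matches the shapes printed above.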

5. Defining the decoder

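# Note: this helper is not called in the code below; the decoder uses a single LSTMCell.
def _create_rnn_cell2():
    def single_rnn_cell(encoder_hidden_units):
        # Build a single cell. Every layer must come from its own call to a function
        # like single_rnn_cell; putting the same cell object into the MultiRNNCell
        # list several times makes the final model fail.
        single_cell = rnn.LSTMCell(encoder_hidden_units*2)
        # Add dropout
        single_cell = rnn.DropoutWrapper(single_cell, output_keep_prob=0.5)
        return single_cell
    # Every element of the list comes from a separate call to single_rnn_cell
    #cell = rnn.MultiRNNCell([single_rnn_cell() for _ in range(self.num_layers)])
    cell = rnn.MultiRNNCell([single_rnn_cell(encoder_hidden_units) for _ in range(1)])
    return cell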

with tf.variable_scope('decoder'):
    #single_cell = rnn.LSTMCell(encoder_hidden_units)
    #decoder_cell = rnn.MultiRNNCell([single_cell for _ in range(1)])
    # The decoder cell is twice as wide as an encoder cell, because the encoder
    # state concatenates the forward and backward directions.
    decoder_cell = rnn.LSTMCell(encoder_hidden_units*2)
    # Initial state of the decoder
    decoder_initial_state = encoder_final_state

    # Output projection layer
    output_layer = tf.layers.Dense(vocab_size,kernel_initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.1))

    decoder_inputs_embedded = tf.nn.embedding_lookup(embedding, decoder_inputs)

    # Training uses the TrainingHelper + BasicDecoder combination. This pairing is fairly
    # standard, although you can also write your own Helper class for custom behaviour.
    training_helper = seq2seq.TrainingHelper(inputs=decoder_inputs_embedded,
                                             sequence_length=decoder_inputs_length,
                                             time_major=False, name='training_helper')
    training_decoder = seq2seq.BasicDecoder(cell=decoder_cell, helper=training_helper,
                                            initial_state=decoder_initial_state,
                                            output_layer=output_layer)

    # Run dynamic_decode. decoder_outputs is a namedtuple with two fields (rnn_output, sample_id):
    # rnn_output: [batch_size, decoder_targets_length, vocab_size], the unnormalized score (logit)
    #             of every vocabulary entry at every decoding step; used to compute the loss
    # sample_id:  [batch_size, decoder_targets_length], tf.int32, the argmax token at every step,
    #             i.e. the decoded answer
    max_target_sequence_length = tf.reduce_max(decoder_inputs_length, name='max_target_len')
    decoder_outputs, _, _ = seq2seq.dynamic_decode(decoder=training_decoder,
                                                   impute_finished=True,
                                                   maximum_iterations=max_target_sequence_length)
    decoder_logits_train = tf.identity(decoder_outputs.rnn_output)
    sample_id = decoder_outputs.sample_id
    # Weight 1.0 for real tokens and 0.0 for padding, so padding does not contribute to the loss
    mask = tf.sequence_mask(decoder_inputs_length,max_target_sequence_length, dtype=tf.float32, name='masks')
    print('\t%s' % repr(decoder_logits_train))
    print('\t%s' % repr(decoder_targets))
    print('\t%s' % repr(sample_id))
    loss = seq2seq.sequence_loss(logits=decoder_logits_train,targets=decoder_targets, weights=mask)
    <tf.Tensor 'decoder/Identity:0' shape=(?, ?, 10) dtype=float32>
    <tf.Tensor 'minibatch/decoder_targets:0' shape=(?, ?) dtype=int32>
    <tf.Tensor 'decoder/decoder/transpose_1:0' shape=(?, ?) dtype=int32>
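The masked loss is easier to see in plain Python. The sketch below mirrors, as far as I understand it, what seq2seq.sequence_loss computes with its default averaging: a softmax cross-entropy per step, weighted by the mask, summed, and divided by the sum of the weights. It is an illustration in pure Python, not the library implementation, and the function names are my own.

```python
import math


def sequence_mask(lengths, max_len):
    """1.0 for real positions (t < length), 0.0 for padding."""
    return [[1.0 if t < n else 0.0 for t in range(max_len)] for n in lengths]


def masked_sequence_loss(logits, targets, mask):
    """Weighted average cross-entropy.

    logits:  [batch, time, vocab] unnormalized scores
    targets: [batch, time] integer token ids
    mask:    [batch, time] weights (0.0 on padding)
    """
    total, weight_sum = 0.0, 0.0
    for seq_logits, seq_targets, seq_mask in zip(logits, targets, mask):
        for step_logits, target, weight in zip(seq_logits, seq_targets, seq_mask):
            # Numerically stable log-sum-exp for the softmax normaliser
            top = max(step_logits)
            log_z = top + math.log(sum(math.exp(x - top) for x in step_logits))
            total += weight * (log_z - step_logits[target])
            weight_sum += weight
    return total / (weight_sum + 1e-12)


mask = sequence_mask([2, 3], max_len=3)
uniform = [[[0.0] * 4 for _ in range(3)] for _ in range(2)]
targets = [[1, 2, 0], [3, 3, 1]]
print(mask)
print(masked_sequence_loss(uniform, targets, mask))  # uniform logits over 4 tokens: log(4)
```

Whatever the model predicts at a padded step is multiplied by a zero weight, so it cannot influence the loss or the gradients.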
with tf.variable_scope('decoder', reuse=True):
    beam_width = 3
    # [batch_size] vector filled with EOS (= 1), used as the start token of every sequence
    start_tokens = tf.ones([batch_size], tf.int32) * EOS
    # Repeat every batch entry of the encoder state beam_width times, as BeamSearchDecoder expects
    encoder_state = nest.map_structure(lambda s: seq2seq.tile_batch(s, beam_width),
                                       encoder_final_state)
    inference_decoder = seq2seq.BeamSearchDecoder(cell=decoder_cell, embedding=embedding,
                                                  start_tokens=start_tokens,
                                                  end_token=EOS,
                                                  initial_state=encoder_state,
                                                  beam_width=beam_width,
                                                  output_layer=output_layer)
    beam_decoder_outputs, _, _ = seq2seq.dynamic_decode(decoder=inference_decoder, maximum_iterations=10)
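Two ideas are packed into the inference block: tiling the encoder state, and the beam search itself. The pure-Python sketch below illustrates both under simplifying assumptions. tile_batch here only mimics the ordering of the TensorFlow function (each batch entry repeated consecutively), and beam_search is a deliberately simplified version with no length penalty; it is not how tf.contrib.seq2seq implements BeamSearchDecoder. The toy_model function is a hypothetical stand-in for the decoder.

```python
import math


def tile_batch(batch, multiplier):
    """Repeat each batch entry `multiplier` times: [a, b] -> [a, a, a, b, b, b]."""
    return [row for row in batch for _ in range(multiplier)]


def beam_search(step_log_probs, start, end, beam_width, max_steps):
    """Simplified beam search.

    step_log_probs(prefix) must return one log-probability per vocabulary id
    for the next token. Returns up to beam_width (tokens, score) pairs, best first.
    """
    beams = [([start], 0.0)]
    finished = []
    for _ in range(max_steps):
        candidates = [
            (prefix + [token], score + log_prob)
            for prefix, score in beams
            for token, log_prob in enumerate(step_log_probs(prefix))
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:beam_width]:
            # Hypotheses that just emitted the end token leave the beam
            (finished if prefix[-1] == end else beams).append((prefix, score))
        if not beams:
            break
    finished.extend(beams)
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished[:beam_width]


def toy_model(prefix, target=(2, 0, 1), vocab_size=3):
    """Hypothetical decoder: strongly prefers the next token of `target`, then the end token 1."""
    step = len(prefix) - 1
    wanted = target[step] if step < len(target) else 1
    return [math.log(0.8) if t == wanted else math.log(0.1) for t in range(vocab_size)]


print(tile_batch([[1, 2], [3, 4]], 3))
for tokens, score in beam_search(toy_model, start=1, end=1, beam_width=3, max_steps=10):
    print(tokens, round(score, 3))
```

The point of keeping several hypotheses alive is that a token which looks slightly worse at one step can still lead to the best overall sequence; greedy decoding (beam width 1) cannot recover from such a choice.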
train_op = tf.train.AdamOptimizer(learning_rate = 0.001).minimize(loss)
sess.run(tf.global_variables_initializer())
def next_feed():
    batch = next(batches)
    
    encoder_inputs_, encoder_inputs_length_ = data_helpers.batch(batch)
    decoder_targets_, decoder_targets_length_ = data_helpers.batch(
        [(sequence) + [EOS] for sequence in batch]
    )
    decoder_inputs_, decoder_inputs_length_ = data_helpers.batch(
        [[EOS] + (sequence) for sequence in batch]
    )
    
    # In a feed_dict, the keys can be Tensors (here, the placeholders)
    return {
        encoder_inputs: encoder_inputs_.T,
        decoder_inputs: decoder_inputs_.T,
        decoder_targets: decoder_targets_.T,
        encoder_inputs_length: encoder_inputs_length_,
        decoder_inputs_length: decoder_inputs_length_
    }

x = next_feed()
print('encoder_inputs:')
print(x[encoder_inputs][0,:])
print('encoder_inputs_length:')
print(x[encoder_inputs_length][0])
print('decoder_inputs:')
print(x[decoder_inputs][0,:])
print('decoder_inputs_length:')
print(x[decoder_inputs_length][0])
print('decoder_targets:')
print(x[decoder_targets][0,:])
encoder_inputs:
[3 3 7 3 0 0 0 0]
encoder_inputs_length:
4
decoder_inputs:
[1 3 3 7 3 0 0 0 0]
decoder_inputs_length:
5
decoder_targets:
[3 3 7 3 1 0 0 0 0]
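The printed example shows the usual teacher-forcing layout: the decoder input is the sequence shifted right by one EOS token, and the target is the same sequence with EOS appended. A minimal pure-Python sketch of that construction follows; make_decoder_pair and pad_to are illustrative names, not functions from the post.

```python
PAD, EOS = 0, 1  # same ids as in the post


def make_decoder_pair(sequence):
    """Return (decoder_input, decoder_target) for one source sequence."""
    decoder_input = [EOS] + sequence   # EOS doubles as the start token
    decoder_target = sequence + [EOS]  # the model must learn to emit EOS at the end
    return decoder_input, decoder_target


def pad_to(sequence, length, pad=PAD):
    """Right-pad a sequence with PAD up to `length`."""
    return sequence + [pad] * (length - len(sequence))


dec_in, dec_tgt = make_decoder_pair([3, 3, 7, 3])
print(pad_to(dec_in, 9))   # [1, 3, 3, 7, 3, 0, 0, 0, 0]
print(pad_to(dec_tgt, 9))  # [3, 3, 7, 3, 1, 0, 0, 0, 0]
```

Both sequences have length len(sequence) + 1, which is why one length placeholder is enough for the decoder inputs and the targets.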
loss_track = []
max_batches = 6001
batches_in_epoch = 200

try:
    # Training loop: run max_batches minibatches and report every batches_in_epoch batches
    for batch in range(max_batches):
        fd = next_feed()
        _, l = sess.run([train_op, loss], fd)
        loss_track.append(l)
        
        if batch == 0 or batch % batches_in_epoch == 0:
            print('batch {}'.format(batch))
            print('  minibatch loss: {}'.format(sess.run(loss, fd)))
            predict_ = sess.run(beam_decoder_outputs.predicted_ids, fd)
            #print(predict_)
            for i, (inp, pred) in enumerate(zip(fd[encoder_inputs], predict_)):
                print('  sample {}:'.format(i + 1))
                print('    input     > {}'.format(inp))
                print('    predicted > {}'.format(pred))
                if i >= 2:
                    break
            print()
        
except KeyboardInterrupt:
    print('training interrupted')
batch 0
  minibatch loss: 2.2935664653778076
  sample 1:
    input     > [9 2 8 0 0 0 0 0]
    predicted > [[5 5 5]
 [5 5 5]
 [5 5 5]
 [5 5 8]
 [8 5 8]
 [8 8 8]
 [8 8 8]
 [8 8 8]
 [8 8 8]
 [8 8 8]]
  sample 2:
    input     > [6 8 9 7 2 6 9 3]
    predicted > [[5 5 5]
 [5 5 5]
 [5 5 8]
 [8 8 8]
 [8 8 8]
 [8 8 8]
 [8 8 8]
 [8 8 8]
 [8 8 8]
 [8 9 8]]
  sample 3:
    input     > [3 6 6 0 0 0 0 0]
    predicted > [[5 5 5]
 [5 5 5]
 [5 5 8]
 [8 8 8]
 [8 8 8]
 [8 8 8]
 [8 8 8]
 [8 8 8]
 [8 8 8]
 [8 9 8]]

batch 200
  minibatch loss: 1.4949365854263306
  sample 1:
    input     > [4 7 8 6 7 9 0 0]
    predicted > [[ 3  3  4]
 [ 4  4  3]
 [ 5  5  5]
 [ 7  5  5]
 [ 4  9  9]
 [ 9  4  4]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [9 3 7 4 0 0 0 0]
    predicted > [[ 4  4  4]
 [ 9  9  9]
 [ 5  4  9]
 [ 4  4  4]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [2 5 9 6 0 0 0 0]
    predicted > [[ 9  6  9]
 [ 6  9  6]
 [ 6  2  6]
 [ 1  1  2]
 [-1 -1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 400
  minibatch loss: 1.2325794696807861
  sample 1:
    input     > [6 7 2 9 0 0 0 0]
    predicted > [[ 6  6  6]
 [ 6  2  2]
 [ 2  4  4]
 [ 4  6  9]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [8 7 3 3 3 2 9 0]
    predicted > [[8 8 8]
 [3 3 3]
 [3 3 3]
 [2 2 2]
 [3 5 5]
 [5 2 2]
 [9 5 9]
 [1 1 1]]
  sample 3:
    input     > [6 4 2 4 3 7 2 0]
    predicted > [[4 4 4]
 [2 2 2]
 [7 2 2]
 [2 7 7]
 [4 4 4]
 [6 7 7]
 [4 6 2]
 [1 1 1]]

batch 600
  minibatch loss: 0.9292899370193481
  sample 1:
    input     > [4 9 5 9 9 2 0 0]
    predicted > [[ 9  9  9]
 [ 4  4  4]
 [ 5  5  4]
 [ 9  9  9]
 [ 2  2  7]
 [ 4  5  2]
 [ 1  1  1]
 [-1 -1 -1]]
  sample 2:
    input     > [8 2 6 4 7 0 0 0]
    predicted > [[ 7  7  4]
 [ 2  2  2]
 [ 4  4  7]
 [ 6  6  6]
 [ 4  5  8]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [7 9 7 9 6 0 0 0]
    predicted > [[ 9  9  9]
 [ 7  7  7]
 [ 7  7  7]
 [ 6  9  9]
 [ 9  7  6]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 800
  minibatch loss: 0.7363898754119873
  sample 1:
    input     > [9 2 6 0 0 0 0 0]
    predicted > [[ 9  2  9]
 [ 2  9  6]
 [ 6  6  2]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [6 8 7 9 6 3 2 0]
    predicted > [[ 6  6  6]
 [ 5  8  5]
 [ 3  6  6]
 [ 6  5  3]
 [ 9  6  3]
 [ 7  3  9]
 [ 3  2  7]
 [ 1  1  1]
 [-1 -1 -1]]
  sample 3:
    input     > [9 2 8 4 9 6 9 3]
    predicted > [[9 9 9]
 [3 2 3]
 [9 9 9]
 [2 8 4]
 [4 4 2]
 [9 6 9]
 [7 9 7]
 [8 7 8]
 [1 1 1]]

batch 1000
  minibatch loss: 0.7347214221954346
  sample 1:
    input     > [3 3 8 0 0 0 0 0]
    predicted > [[ 3  3  3]
 [ 3  3  8]
 [ 8  3  6]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [3 8 4 9 6 3 5 4]
    predicted > [[ 3  3  3]
 [ 3  3  3]
 [ 4  4  4]
 [ 5  5  5]
 [ 9  6  6]
 [ 5  9  9]
 [ 6  4  4]
 [ 1  1  8]
 [-1 -1  1]]
  sample 3:
    input     > [3 4 7 0 0 0 0 0]
    predicted > [[ 3  4  7]
 [ 4  3  4]
 [ 7  7  3]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 1200
  minibatch loss: 0.43508097529411316
  sample 1:
    input     > [5 5 5 4 4 4 4 0]
    predicted > [[ 5  5  5]
 [ 5  5  4]
 [ 4  4  5]
 [ 4  5  5]
 [ 5  4  4]
 [ 5  4  5]
 [ 1  5  1]
 [-1  1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [2 7 3 0 0 0 0 0]
    predicted > [[ 2  7  2]
 [ 7  2  7]
 [ 3  3  6]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [2 2 8 0 0 0 0 0]
    predicted > [[ 2  2  2]
 [ 2  2  8]
 [ 8  5  2]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 1400
  minibatch loss: 0.41912826895713806
  sample 1:
    input     > [7 8 5 3 2 0 0 0]
    predicted > [[ 5  7  8]
 [ 7  8  7]
 [ 8  5  5]
 [ 3  3  3]
 [ 2  2  2]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [8 7 6 9 2 0 0 0]
    predicted > [[ 8  7  8]
 [ 7  8  6]
 [ 6  9  7]
 [ 9  6  2]
 [ 2  2  9]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [7 8 3 8 5 8 7 6]
    predicted > [[8 8 8]
 [7 8 7]
 [8 7 8]
 [3 3 3]
 [7 7 7]
 [8 8 5]
 [6 6 3]
 [5 5 5]
 [1 1 1]]

batch 1600
  minibatch loss: 0.3989475965499878
  sample 1:
    input     > [3 7 4 4 7 2 0 0]
    predicted > [[ 3  7  7]
 [ 7  3  4]
 [ 4  4  3]
 [ 4  4  3]
 [ 7  3  4]
 [ 2  2  2]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [5 7 6 4 2 4 9 5]
    predicted > [[5 5 5]
 [7 7 7]
 [6 6 6]
 [4 4 4]
 [2 2 2]
 [5 4 9]
 [9 9 5]
 [4 5 4]
 [1 1 1]]
  sample 3:
    input     > [7 6 4 0 0 0 0 0]
    predicted > [[ 7  6  7]
 [ 6  7  4]
 [ 4  4  6]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 1800
  minibatch loss: 0.31475651264190674
  sample 1:
    input     > [8 5 9 9 9 4 0 0]
    predicted > [[ 8  8  8]
 [ 5  9  9]
 [ 9  5  5]
 [ 9  4  9]
 [ 4  9  4]
 [ 9  9  9]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [9 6 6 0 0 0 0 0]
    predicted > [[ 9  6  6]
 [ 6  9  9]
 [ 6  6  9]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [7 3 6 3 3 7 0 0]
    predicted > [[ 3  7  7]
 [ 7  3  3]
 [ 6  6  3]
 [ 3  3  6]
 [ 7  3  7]
 [ 3  7  3]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 2000
  minibatch loss: 0.41449815034866333
  sample 1:
    input     > [6 6 8 0 0 0 0 0]
    predicted > [[ 6  6  6]
 [ 6  9  3]
 [ 8  7  9]
 [ 1  1  7]
 [-1 -1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [8 8 5 6 2 9 2 3]
    predicted > [[8 8 8]
 [8 8 8]
 [5 5 6]
 [6 6 5]
 [2 2 2]
 [9 9 9]
 [2 3 2]
 [3 2 3]
 [1 1 1]]
  sample 3:
    input     > [5 2 3 2 4 7 7 6]
    predicted > [[2 5 2]
 [5 2 5]
 [4 2 4]
 [3 3 3]
 [2 5 2]
 [7 7 7]
 [7 4 6]
 [6 6 5]
 [1 1 1]]

batch 2200
  minibatch loss: 0.2028750777244568
  sample 1:
    input     > [2 5 6 8 6 7 7]
    predicted > [[2 2 2]
 [5 9 6]
 [6 7 5]
 [7 5 8]
 [9 7 7]
 [7 6 6]
 [8 5 7]
 [1 1 1]]
  sample 2:
    input     > [9 3 3 0 0 0 0]
    predicted > [[ 9  9  3]
 [ 3  3  9]
 [ 3  6  9]
 [ 1  1  3]
 [-1 -1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [7 9 9 0 0 0 0]
    predicted > [[ 7  8  6]
 [ 9  6  8]
 [ 9  9  9]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 2400
  minibatch loss: 0.17885658144950867
  sample 1:
    input     > [3 6 8 3 2 0 0 0]
    predicted > [[ 3  6  3]
 [ 6  3  6]
 [ 8  8  8]
 [ 3  3  3]
 [ 2  3  3]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [4 5 4 6 5 5 0 0]
    predicted > [[ 4  4  4]
 [ 5  5  5]
 [ 4  4  4]
 [ 5  6  5]
 [ 6  5  6]
 [ 5  5  4]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [9 7 2 3 3 0 0 0]
    predicted > [[ 9  9  9]
 [ 7  7  7]
 [ 3  2  3]
 [ 2  3  2]
 [ 3  3  6]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 2600
  minibatch loss: 0.20247018337249756
  sample 1:
    input     > [3 7 9 0 0 0 0 0]
    predicted > [[ 3  7  3]
 [ 7  3  5]
 [ 9  9  2]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [2 5 2 5 8 0 0 0]
    predicted > [[ 2  2  2]
 [ 5  5  5]
 [ 2  2  2]
 [ 5  5  5]
 [ 8  5  3]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [5 7 3 7 0 0 0 0]
    predicted > [[ 5  5  7]
 [ 7  7  5]
 [ 3  7  5]
 [ 7  3  3]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 2800
  minibatch loss: 0.24160973727703094
  sample 1:
    input     > [8 2 2 8 4 0 0 0]
    predicted > [[ 8  2  8]
 [ 2  8  2]
 [ 2  8  2]
 [ 8  2  4]
 [ 4  4  8]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [3 3 8 7 0 0 0 0]
    predicted > [[ 3  3  3]
 [ 3  3  8]
 [ 8  7  3]
 [ 7  8  7]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [5 5 8 7 3 7 8 5]
    predicted > [[5 5 5]
 [5 5 5]
 [8 8 8]
 [7 7 7]
 [3 3 7]
 [7 7 3]
 [8 5 8]
 [5 8 5]
 [1 1 1]]

batch 3000
  minibatch loss: 0.23292377591133118
  sample 1:
    input     > [4 4 2 7 0 0 0 0]
    predicted > [[ 4  4  4]
 [ 4  4  2]
 [ 2  7  4]
 [ 7  2  7]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [9 5 3 4 8 7 6 9]
    predicted > [[9 9 9]
 [5 5 5]
 [3 3 8]
 [4 4 3]
 [8 8 4]
 [7 7 6]
 [9 6 7]
 [6 9 9]
 [1 1 1]]
  sample 3:
    input     > [5 5 2 4 2 0 0 0]
    predicted > [[ 5  5  5]
 [ 5  5  5]
 [ 2  4  2]
 [ 4  2  2]
 [ 2  2  4]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 3200
  minibatch loss: 0.13823337852954865
  sample 1:
    input     > [3 3 7 8 2 3 7 2]
    predicted > [[3 3 3]
 [3 3 3]
 [7 8 8]
 [8 7 7]
 [2 2 2]
 [3 3 3]
 [7 7 2]
 [2 2 7]
 [1 1 1]]
  sample 2:
    input     > [6 2 7 3 5 4 7 2]
    predicted > [[6 6 6]
 [2 2 2]
 [7 7 7]
 [3 3 5]
 [4 5 3]
 [5 4 2]
 [7 2 4]
 [2 7 7]
 [1 1 1]]
  sample 3:
    input     > [2 2 7 7 2 0 0 0]
    predicted > [[ 2  2  2]
 [ 2  7  2]
 [ 7  2  7]
 [ 7  2  2]
 [ 2  7  7]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 3400
  minibatch loss: 0.118137888610363
  sample 1:
    input     > [5 5 7 7 6 0 0 0]
    predicted > [[ 5  5  5]
 [ 5  7  7]
 [ 7  5  5]
 [ 7  5  5]
 [ 6  7  6]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [8 2 4 5 0 0 0 0]
    predicted > [[ 8  4  8]
 [ 2  8  2]
 [ 4  2  4]
 [ 5  5  4]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [3 5 2 4 0 0 0 0]
    predicted > [[ 3  3  3]
 [ 5  5  5]
 [ 2  2  2]
 [ 4  5  8]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 3600
  minibatch loss: 0.18091285228729248
  sample 1:
    input     > [9 2 3 2 7 6 6 3]
    predicted > [[9 9 9]
 [2 2 2]
 [3 2 3]
 [2 3 2]
 [6 7 7]
 [7 6 6]
 [6 6 6]
 [3 3 3]
 [1 1 1]]
  sample 2:
    input     > [9 7 6 8 5 3 0 0]
    predicted > [[ 9  9  9]
 [ 7  7  7]
 [ 6  6  9]
 [ 8  8  7]
 [ 5  5  3]
 [ 3  7  5]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [3 4 4 9 8 8 8 5]
    predicted > [[3 3 3]
 [4 4 4]
 [4 4 9]
 [9 8 8]
 [8 9 4]
 [8 5 8]
 [5 8 5]
 [8 8 5]
 [1 1 1]]

batch 3800
  minibatch loss: 0.1578817516565323
  sample 1:
    input     > [4 9 4 5 4 3 9 0]
    predicted > [[ 4  4  4]
 [ 4  9  4]
 [ 9  4  9]
 [ 5  5  5]
 [ 9  4  9]
 [ 8  3  3]
 [ 3  9  4]
 [ 1  1  1]
 [-1 -1 -1]]
  sample 2:
    input     > [8 3 3 4 0 0 0 0]
    predicted > [[ 8  3  3]
 [ 3  8  8]
 [ 3  8  8]
 [ 4  3  4]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [5 4 7 9 5 0 0 0]
    predicted > [[ 5  5  5]
 [ 4  4  7]
 [ 7  5  4]
 [ 9  7  9]
 [ 5  9  4]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 4000
  minibatch loss: 0.21402882039546967
  sample 1:
    input     > [2 4 9 4 4 3 2 0]
    predicted > [[ 4  2  4]
 [ 2  4  2]
 [ 9  9  9]
 [ 4  4  4]
 [ 2  4  2]
 [ 8  3  8]
 [ 4  2  1]
 [ 1  1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [7 8 5 0 0 0 0 0]
    predicted > [[ 7  7  7]
 [ 8  8  8]
 [ 5  8  9]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [6 6 2 0 0 0 0 0]
    predicted > [[ 6  6  6]
 [ 6  2  6]
 [ 2  6  4]
 [ 1  1  2]
 [-1 -1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 4200
  minibatch loss: 0.07165724784135818
  sample 1:
    input     > [5 8 2 6 0 0 0 0]
    predicted > [[ 5  8  8]
 [ 8  5  5]
 [ 2  6  2]
 [ 6  2  6]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [4 3 9 8 7 3 9 2]
    predicted > [[4 4 4]
 [3 3 3]
 [9 9 9]
 [8 7 8]
 [7 8 7]
 [3 3 3]
 [9 9 2]
 [2 3 9]
 [1 1 1]]
  sample 3:
    input     > [4 2 3 8 2 0 0 0]
    predicted > [[ 4  2  4]
 [ 2  4  3]
 [ 3  8  2]
 [ 8  3  2]
 [ 2  2  8]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 4400
  minibatch loss: 0.08584733307361603
  sample 1:
    input     > [5 6 4 5 2 5 5]
    predicted > [[5 5 5]
 [6 6 6]
 [4 4 4]
 [5 5 5]
 [2 6 2]
 [5 2 5]
 [5 5 6]
 [1 1 1]]
  sample 2:
    input     > [5 5 8 7 9 3 0]
    predicted > [[ 5  5  5]
 [ 5  5  5]
 [ 8  8  8]
 [ 9  7  6]
 [ 7  9  8]
 [ 3  3  7]
 [ 1  1  1]
 [-1 -1 -1]]
  sample 3:
    input     > [9 5 9 2 5 7 3]
    predicted > [[9 9 9]
 [5 5 5]
 [9 9 9]
 [2 2 2]
 [5 7 5]
 [7 5 7]
 [3 3 7]
 [1 1 1]]

batch 4600
  minibatch loss: 0.08049434423446655
  sample 1:
    input     > [3 9 4 0 0 0 0 0]
    predicted > [[ 3  3  9]
 [ 9  4  3]
 [ 4  9  4]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [7 8 9 4 3 5 2 3]
    predicted > [[7 7 8]
 [8 8 7]
 [9 9 9]
 [4 4 4]
 [3 3 3]
 [5 7 5]
 [2 9 2]
 [3 3 3]
 [1 1 1]]
  sample 3:
    input     > [9 3 6 4 4 6 5 9]
    predicted > [[9 9 9]
 [3 3 3]
 [6 4 6]
 [4 6 4]
 [4 6 4]
 [6 5 5]
 [5 4 6]
 [9 9 6]
 [1 1 1]]

batch 4800
  minibatch loss: 0.037724826484918594
  sample 1:
    input     > [5 6 8 2 5 0 0]
    predicted > [[ 5  6  6]
 [ 6  5  5]
 [ 8  8  8]
 [ 2  2  5]
 [ 5  5  2]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [7 9 5 0 0 0 0]
    predicted > [[ 7  5  5]
 [ 9  7  2]
 [ 5  9  7]
 [ 1  1  9]
 [-1 -1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [5 3 3 6 0 0 0]
    predicted > [[ 5  3  3]
 [ 3  5  5]
 [ 3  5  3]
 [ 6  3  5]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 5000
  minibatch loss: 0.12354864180088043
  sample 1:
    input     > [4 6 9 6 5 7 8 9]
    predicted > [[4 4 4]
 [6 6 6]
 [9 9 6]
 [6 6 9]
 [5 5 5]
 [7 7 8]
 [9 8 6]
 [8 9 7]
 [1 1 1]]
  sample 2:
    input     > [6 5 9 9 8 0 0 0]
    predicted > [[ 6  6  6]
 [ 5  5  9]
 [ 9  9  5]
 [ 9  8  8]
 [ 8  9  9]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [6 7 2 8 9 7 0 0]
    predicted > [[ 6  6  6]
 [ 7  7  7]
 [ 2  2  2]
 [ 8  8  8]
 [ 9  6  7]
 [ 7  8  9]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 5200
  minibatch loss: 0.05009409785270691
  sample 1:
    input     > [6 3 8 7 0 0 0]
    predicted > [[ 6  6  6]
 [ 3  8  8]
 [ 8  3  3]
 [ 7  7  6]
 [ 1  1  7]
 [-1 -1  1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [9 2 5 9 0 0 0]
    predicted > [[ 9  2  9]
 [ 2  9  5]
 [ 5  5  2]
 [ 9  9  9]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 3:
    input     > [3 5 2 7 7 6 0]
    predicted > [[ 3  5  3]
 [ 5  3  5]
 [ 2  2  7]
 [ 7  7  2]
 [ 7  6  2]
 [ 6  7  5]
 [ 1  1  1]
 [-1 -1 -1]]

batch 5400
  minibatch loss: 0.09247519075870514
  sample 1:
    input     > [8 6 5 3 8 7 4 2]
    predicted > [[8 8 8]
 [6 6 6]
 [5 5 5]
 [3 8 8]
 [8 3 3]
 [7 7 2]
 [4 2 7]
 [2 4 4]
 [1 1 1]]
  sample 2:
    input     > [8 6 3 9 4 7 5 0]
    predicted > [[ 8  3  3]
 [ 6  8  8]
 [ 3  9  9]
 [ 9  6  6]
 [ 4  4  7]
 [ 7  7  4]
 [ 5  7  4]
 [ 1  1  1]
 [-1 -1 -1]]
  sample 3:
    input     > [5 3 8 8 6 6 3 0]
    predicted > [[ 5  5  5]
 [ 8  3  8]
 [ 3  8  3]
 [ 6  8  6]
 [ 8  6  3]
 [ 3  6  8]
 [ 6  3  9]
 [ 1  1  1]
 [-1 -1 -1]]

batch 5600
  minibatch loss: 0.05249354988336563
  sample 1:
    input     > [5 9 5 0 0 0 0 0]
    predicted > [[ 5  5  5]
 [ 9  9  5]
 [ 5  8  9]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [9 6 6 7 3 6 5 6]
    predicted > [[ 9  9  9]
 [ 6  6  6]
 [ 6  6  6]
 [ 7  3  7]
 [ 3  7  3]
 [ 6  6  6]
 [ 5  5  5]
 [ 6  6  1]
 [ 1  1 -1]]
  sample 3:
    input     > [6 3 9 5 9 9 3 0]
    predicted > [[ 6  3  3]
 [ 3  6  6]
 [ 9  9  9]
 [ 5  5  5]
 [ 9  9  9]
 [ 9  9  9]
 [ 3  3  8]
 [ 1  1  1]
 [-1 -1 -1]]

batch 5800
  minibatch loss: 0.08289551734924316
  sample 1:
    input     > [8 9 3 5 2 0 0 0]
    predicted > [[ 8  8  9]
 [ 9  9  8]
 [ 3  5  3]
 [ 5  3  5]
 [ 2  2  2]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [7 4 8 3 2 3 4 9]
    predicted > [[ 7  7  7]
 [ 4  4  4]
 [ 8  8  3]
 [ 3  3  8]
 [ 2  2  2]
 [ 3  3  4]
 [ 4  9  3]
 [ 9  4  9]
 [ 1  1  6]
 [-1 -1  1]]
  sample 3:
    input     > [2 3 6 6 0 0 0 0]
    predicted > [[ 2  2  3]
 [ 3  3  2]
 [ 6  6  6]
 [ 6  9  6]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]

batch 6000
  minibatch loss: 0.03706203028559685
  sample 1:
    input     > [3 5 3 9 0 0 0 0]
    predicted > [[ 3  3  5]
 [ 5  3  3]
 [ 3  5  3]
 [ 9  9  9]
 [ 1  1  1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]
 [-1 -1 -1]]
  sample 2:
    input     > [8 2 9 6 2 8 2 3]
    predicted > [[8 8 8]
 [2 2 9]
 [9 9 2]
 [6 6 2]
 [2 2 6]
 [8 8 8]
 [3 2 3]
 [2 3 9]
 [1 1 1]]
  sample 3:
    input     > [6 4 8 4 9 7 4 0]
    predicted > [[ 6  6  6]
 [ 4  4  4]
 [ 8  8  4]
 [ 4  9  8]
 [ 9  4  8]
 [ 7  7  9]
 [ 4  4  6]
 [ 1  1  1]
 [-1 -1 -1]]
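In the log above each "predicted" block is a [time, beam_width] matrix: one column per beam, with 1 (EOS) marking the end and -1 filling the steps after it. As I understand the BeamSearchDecoder documentation, beams are ordered from best to worst, so column 0 is the top hypothesis. A small pure-Python sketch for reading one hypothesis out of such a matrix (the function name is my own):

```python
EOS = 1  # end-of-sequence id used in the post


def hypothesis(predicted, beam=0, eos=EOS):
    """Read one beam (column) from a [time, beam_width] matrix, stopping at EOS or -1 padding."""
    tokens = []
    for step in predicted:
        token = step[beam]
        if token == eos or token < 0:
            break
        tokens.append(token)
    return tokens


# Sample 1 of batch 6000 from the log above (input was [3 5 3 9])
predicted = [
    [3, 3, 5],
    [5, 3, 3],
    [3, 5, 3],
    [9, 9, 9],
    [1, 1, 1],
    [-1, -1, -1],
]
print(hypothesis(predicted, beam=0))  # [3, 5, 3, 9]
print(hypothesis(predicted, beam=1))  # [3, 3, 5, 9]
```

Reading the columns this way shows that by the end of training the top beam reproduces the input, while the lower-ranked beams are near misses.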
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot(loss_track)
print('loss {:.4f} after {} examples (batch_size={})'.format(loss_track[-1], 
                                                             len(loss_track)*batch_size, batch_size))
loss 0.0375 after 60010 examples (batch_size=10)
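The per-minibatch loss is noisy (in the log it jumps from about 0.038 at batch 4800 to about 0.124 at batch 5000), so a smoothed curve can be easier to read than the raw loss_track. One simple option, shown here as a sketch and not something the original notebook does, is a trailing moving average:

```python
def moving_average(values, window):
    """Trailing moving average; the first window-1 points average over what is available."""
    if window < 1:
        raise ValueError('window must be at least 1')
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


print(moving_average([1, 2, 3, 4], window=2))  # [1.0, 1.5, 2.5, 3.5]
```

In the notebook this could be plotted with plt.plot(moving_average(loss_track, 100)), where the window of 100 is an arbitrary choice.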

[Figure: training loss curve from plt.plot(loss_track); image not available]

