TensorFlow Example mnist_softmax.py Explained, Part 2

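TensorFlow Basics

TensorFlow differs from ordinary programming languages. To run efficiently, it introduces the concept of a computation graph. A computation graph only defines variables and the operations on them; it holds no concrete data. Values are supplied, and the operations executed, only at run time. The benefit is that heavy computation can run outside Python, in efficient native code (C/C++, for example). The Python code is really only responsible for defining the computation.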

A TensorFlow program therefore usually has two parts:

Building the computation graph
Running the computation graph
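
The two phases can be illustrated without TensorFlow at all. The following is a minimal sketch in plain Python, not TensorFlow's actual implementation; the names Node, placeholder, constant, and run are hypothetical and exist only for this illustration. Building the expression records operations, and nothing is computed until run is called with concrete values.

```python
# A toy "define, then run" graph in plain Python (not TensorFlow).
# All class and function names here are hypothetical.

class Node:
    """A graph node: an operation name plus its input nodes."""
    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value  # only used by constants

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def placeholder():
    return Node("placeholder")


def constant(value):
    return Node("const", value=value)


def run(node, feed_dict):
    """Evaluate a node, taking placeholder values from feed_dict."""
    if node.op == "const":
        return node.value
    if node.op == "placeholder":
        return feed_dict[node]
    left, right = (run(n, feed_dict) for n in node.inputs)
    return left + right if node.op == "add" else left * right


# Phase 1: build the graph. No arithmetic happens here.
x = placeholder()
y = x * constant(3.0) + constant(1.0)

# Phase 2: run the graph with concrete data.
result_a = run(y, {x: 2.0})
result_b = run(y, {x: 10.0})
print(result_a, result_b)
```

The same graph y is evaluated twice with different inputs, which mirrors how a placeholder receives different data on each run.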

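TensorFlow computations are expressed in terms of tensors. Three constructs are commonly used to create them: tf.placeholder, tf.constant, and tf.Variable. tf.constant is a constant whose value does not change once defined. A placeholder has no value of its own; its value must be supplied at run time, and it usually holds the data and labels. A Variable is mutable: its value can change while the graph runs, so it usually holds the model parameters. Variables must be initialized before the graph is run.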

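When building a graph, you define variables with these constructs and apply ordinary operations to them until you arrive at the final graph. TensorFlow also ships many predefined subgraphs (built-in functions), which you can use simply by calling them.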

To run a graph, first create a session, then call session.run() to execute it.

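sess = tf.InteractiveSession() creates an interactive session that installs itself as the default session, whereas sess = tf.Session() creates an ordinary session. A Session object must be closed after use to release its resources. Besides calling close explicitly, you can use a "with" block to close it automatically.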

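# Task finished; close the session.
sess.close()

# A with block closes the session automatically.
# (product is assumed to be a tensor defined earlier in the graph.)
with tf.Session() as sess:
  result = sess.run([product])
  print(result)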
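
The automatic closing relies on Python's context manager protocol. As a rough sketch, assuming a hypothetical stand-in class ToySession (this is not how tf.Session is implemented, only the general pattern), the "with" statement calls __exit__ when the block ends, and __exit__ calls close:

```python
# Hypothetical stand-in for a session, to show what "with" does.
class ToySession:
    def __init__(self):
        self.closed = False

    def run(self, value):
        if self.closed:
            raise RuntimeError("session is closed")
        return value

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()   # runs even if the block raised
        return False   # do not swallow exceptions


with ToySession() as toy_sess:
    toy_result = toy_sess.run(42)

print(toy_result, toy_sess.closed)
```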

Walking Through mnist_softmax.py

With the basics above in place, let us look at the main function in mnist_softmax.py.

The full code is as follows:

def main(_):
  # Import data
  mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)

  # Create the model
  x = tf.placeholder(tf.float32, [None, 784])
  W = tf.Variable(tf.zeros([784, 10]))
  b = tf.Variable(tf.zeros([10]))
  y = tf.matmul(x, W) + b

  # Define loss and optimizer
  y_ = tf.placeholder(tf.float32, [None, 10])

  # The raw formulation of cross-entropy,
  #
  #   tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.nn.softmax(y)),
  #                                 reduction_indices=[1]))
  #
  # can be numerically unstable.
  #
  # So here we use tf.nn.softmax_cross_entropy_with_logits on the raw
  # outputs of 'y', and then average across the batch.
  cross_entropy = tf.reduce_mean(
      tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
  train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

  sess = tf.InteractiveSession()
  tf.global_variables_initializer().run()
  # Train
  for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

  # Test trained model
  correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
  accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
  print(sess.run(accuracy, feed_dict={x: mnist.test.images,
                                      y_: mnist.test.labels}))

The first few lines are:

  # Create the model
  x = tf.placeholder(tf.float32, [None, 784])
  W = tf.Variable(tf.zeros([784, 10]))
  b = tf.Variable(tf.zeros([10]))
  y = tf.matmul(x, W) + b

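In the code above, the graph defines x as a placeholder that will later hold the MNIST input data. W and b are Variables that represent the model parameters. y is obtained by operating on these, which defines about the simplest graph possible.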
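
To make the shapes concrete, here is a small plain-Python sketch of y = xW + b using toy sizes (4 features and 3 classes instead of 784 and 10). The helper names matmul and add_bias and the toy data are assumptions made for this illustration; TensorFlow's tf.matmul works on tensors, not nested lists.

```python
def matmul(a, b):
    """Multiply an [n, k] matrix by a [k, m] matrix (lists of lists)."""
    return [[sum(a_ik * b_kj for a_ik, b_kj in zip(row, col))
             for col in zip(*b)]
            for row in a]


def add_bias(rows, bias):
    """Add the same bias vector to every row (broadcasting)."""
    return [[v + b_j for v, b_j in zip(row, bias)] for row in rows]


# Toy sizes: 2 examples, 4 features, 3 classes
# (the script uses 784 features and 10 classes).
x_batch = [[1.0, 0.0, 2.0, 1.0],
           [0.0, 1.0, 0.0, 3.0]]
W_toy = [[0.0] * 3 for _ in range(4)]   # zeros, like tf.zeros([784, 10])
b_toy = [0.0] * 3                       # zeros, like tf.zeros([10])

logits = add_bias(matmul(x_batch, W_toy), b_toy)
print(len(logits), len(logits[0]))  # batch size, number of classes
```

Each row of the result holds one score per class for one input example, which is why a [None, 784] input and a [784, 10] weight matrix give a [None, 10] output.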

Next:

  # Define loss and optimizer
  y_ = tf.placeholder(tf.float32, [None, 10])

This defines another placeholder, y_, which holds the labels that correspond to the data.

Then:

  cross_entropy = tf.reduce_mean(
      tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))

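This code uses several built-in TensorFlow functions to define a more complex graph, cross_entropy, which computes the cross-entropy between the model output y and the true labels y_.

Note:

The y in this code has not been passed through softmax; the softmax step is performed inside the built-in function.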

The code contains this comment:
  # The raw formulation of cross-entropy,
  #
  #   tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.nn.softmax(y)),
  #                                 reduction_indices=[1]))
  #
  # can be numerically unstable.
  #
  # So here we use tf.nn.softmax_cross_entropy_with_logits on the raw
  # outputs of 'y', and then average across the batch.
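The comment says that what tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y) computes can be written out roughly as -tf.reduce_sum(y_ * tf.log(tf.nn.softmax(y)), reduction_indices=[1]). However, cross-entropy defined by this expanded form can be numerically unstable: for example, if a softmax output underflows to 0, its logarithm is negative infinity. That is why the built-in function is used instead.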
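
The instability is easy to reproduce in plain Python. The sketch below compares the literal formula with a common remedy, computing log-softmax after subtracting the largest logit (the log-sum-exp trick). This is an illustration under that assumption, not a description of TensorFlow's internal implementation; the function names are hypothetical.

```python
import math


def naive_cross_entropy(logits, labels):
    """-sum(labels * log(softmax(logits))), computed literally."""
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return -sum(t * math.log(e / total) for t, e in zip(labels, exps))


def stable_cross_entropy(logits, labels):
    """Same quantity via log-softmax with the max logit subtracted."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(t * (z - log_sum) for t, z in zip(labels, logits))


labels = [0.0, 1.0, 0.0]

# Moderate logits: both formulas agree.
small = [1.0, 2.0, 3.0]
naive_small = naive_cross_entropy(small, labels)
stable_small = stable_cross_entropy(small, labels)
print(naive_small, stable_small)

# Large logits: math.exp overflows in the naive version.
large = [1000.0, 0.0, -1000.0]
try:
    naive_cross_entropy(large, labels)
    naive_failed = False
except (OverflowError, ValueError):
    naive_failed = True
stable_large = stable_cross_entropy(large, labels)
print(naive_failed, stable_large)
```

Working with the raw logits lets the stable version avoid ever forming a probability that overflows or rounds to zero.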

Finally, the line:

 train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

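likewise defines a graph node, train_step, though a more complex one. Each time it is run, it performs one gradient descent update (learning rate 0.5) that reduces cross_entropy.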
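
What a single gradient descent update does for this kind of model can be sketched by hand. The following plain-Python example is only an illustration on a toy problem (2 features, 2 classes); TensorFlow derives the gradients automatically, whereas here they are written out using the standard result that the gradient of softmax cross-entropy with respect to the logits is softmax(y) - y_. All function names are hypothetical.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, W, b):
    """Logits for one example: x @ W + b."""
    return [sum(x_i * W[i][j] for i, x_i in enumerate(x)) + b[j]
            for j in range(len(b))]


def mean_loss(xs, ts, W, b):
    """Mean cross-entropy over the batch."""
    total = 0.0
    for x, t in zip(xs, ts):
        p = softmax(forward(x, W, b))
        total -= sum(t_j * math.log(p_j) for t_j, p_j in zip(t, p))
    return total / len(xs)


def sgd_step(xs, ts, W, b, lr):
    """One in-place gradient descent update on the mean cross-entropy."""
    n = len(xs)
    grad_W = [[0.0] * len(b) for _ in W]
    grad_b = [0.0] * len(b)
    for x, t in zip(xs, ts):
        p = softmax(forward(x, W, b))
        for j in range(len(b)):
            d = (p[j] - t[j]) / n       # d(loss)/d(logit_j)
            grad_b[j] += d
            for i, x_i in enumerate(x):
                grad_W[i][j] += x_i * d
    for j in range(len(b)):
        b[j] -= lr * grad_b[j]
        for i in range(len(W)):
            W[i][j] -= lr * grad_W[i][j]


# Toy batch: 2 features, 2 classes, one-hot labels.
xs = [[1.0, 0.0], [0.0, 1.0]]
ts = [[1.0, 0.0], [0.0, 1.0]]
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]

loss_before = mean_loss(xs, ts, W, b)
for _ in range(20):
    sgd_step(xs, ts, W, b, lr=0.5)
loss_after = mean_loss(xs, ts, W, b)
print(loss_before, loss_after)
```

With all-zero parameters the model predicts each class with probability 0.5, so the starting loss is log 2; repeated updates push it down.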

Running the Training Graph

At this point the graph for the training phase is complete. The following code runs it:

  sess = tf.InteractiveSession()
  tf.global_variables_initializer().run()
  # Train
  for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

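Here sess = tf.InteractiveSession() creates an interactive session.

Because the graph contains Variables, they must be initialized before the graph runs. The code calls tf.global_variables_initializer().run() to initialize all of them; no session argument is needed because the interactive session is already the default session.

Inside the for loop, batch_xs, batch_ys = mnist.train.next_batch(100) fetches one batch of 100 examples together with their labels.

sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys}) then runs train_step. When running a graph, every placeholder it depends on must be given a value through feed_dict.

The last part of the code tests the trained model: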

  # Test trained model
  correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
  accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
  print(sess.run(accuracy, feed_dict={x: mnist.test.images,
                                      y_: mnist.test.labels}))

As in the training part, the first two lines build the test graph and the last line runs it.
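
The test graph's logic (argmax of the prediction, compare with argmax of the one-hot label, cast to float, average) can be mirrored in plain Python. This is a sketch with made-up toy data and hypothetical helper names, intended only to show what the three TensorFlow calls compute together.

```python
def argmax(values):
    """Index of the largest entry (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(logits_batch, labels_batch):
    """Fraction of rows whose predicted class equals the labelled class."""
    correct = [argmax(y) == argmax(t)
               for y, t in zip(logits_batch, labels_batch)]
    # Turn booleans into floats and average them, which is the role
    # tf.cast and tf.reduce_mean play in the script.
    return sum(float(c) for c in correct) / len(correct)


logits_batch = [[2.0, 0.1, -1.0],   # predicts class 0
                [0.3, 0.2, 1.5],    # predicts class 2
                [0.0, 4.0, 1.0],    # predicts class 1
                [1.0, 0.5, 0.2]]    # predicts class 0
labels_batch = [[1, 0, 0],
                [0, 0, 1],
                [0, 0, 1],          # true class 2: a miss
                [1, 0, 0]]

acc = accuracy(logits_batch, labels_batch)
print(acc)
```

Note that the raw outputs y can be compared directly: softmax preserves the ordering of the scores, so the argmax is the same with or without it.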