Tensorflow快餐教程(12) - 用机器写莎士比亚的戏剧

高层框架:TFLearn和Keras

上一节我们学习了Tensorflow的高层API封装,可以通过简单的几步就生成一个DNN分类器来解决MNIST手写识别问题。

尽管Tensorflow也在不断推进Estimator API。但是,这并不是工具的全部。在Tensorflow官方的API方外,我们还有强大的工具,比如TFLearn和Keras。

这节我们就做一个武器库的展示,看看专门为Tensorflow做的高层框架TFLearn和跨Tensorflow和CNTK几种后端的Keras为我们做了哪些强大的功能封装。

机器来写莎士比亚的戏剧

之前我们简单介绍了强大的用于处理序列数据的RNN。RNN比起其它网络的重要优点是可以学习了序列数据之后进行自生成。
比如,学习《唐诗三百首》可以写诗,学习了Linux Kernel源代码就能写C代码(虽然基本上编译不过)。

我们首先来一个自动写莎士比亚戏剧的例子吧。
在看代码之前我先唠叨几句。深度学习对于数据量的要求还是比较高的,像训练自动生成的这种,一般得几百万到几千万量级的训练数据下才能有好的效果。比如只用几篇小说来训练肯定生成不知所云的小说。就算是人类也做不到只学几首诗就会写诗么。
另外一点就是,训练数据量上来了,对于时间和算力的要求也是指数级提高的。
比如我们用莎翁的戏剧来训练,虽然数据量也不是特别的大,也就16万多行,但是在CPU上训练的话也不是一两个小时能搞定的,大约是天为单位。
后面我们举图像或视频的例子,在CPU上训,论月也是并不意外的。

那么,这个需要训一天左右的例子,代码会多复杂呢?答案是核心代码不过10几行,总共加上数据处理和测试代码也不过50行左右。

from __future__ import absolute_import, division, print_function

import os
import pickle
from six.moves import urllib

import tflearn
from tflearn.data_utils import *

path = "shakespeare_input.txt"
char_idx_file = 'char_idx.pickle'

if not os.path.isfile(path):
    urllib.request.urlretrieve("https://raw.githubusercontent.com/tflearn/tflearn.github.io/master/resources/shakespeare_input.txt", path)

maxlen = 25

char_idx = None
if os.path.isfile(char_idx_file):
  print('Loading previous char_idx')
  char_idx = pickle.load(open(char_idx_file, 'rb'))

X, Y, char_idx = \
    textfile_to_semi_redundant_sequences(path, seq_maxlen=maxlen, redun_step=3,
                                         pre_defined_char_idx=char_idx)

pickle.dump(char_idx, open(char_idx_file,'wb'))

g = tflearn.input_data([None, maxlen, len(char_idx)])
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512)
g = tflearn.dropout(g, 0.5)
g = tflearn.fully_connected(g, len(char_idx), activation='softmax')
g = tflearn.regression(g, optimizer='adam', loss='categorical_crossentropy',
                       learning_rate=0.001)

m = tflearn.SequenceGenerator(g, dictionary=char_idx,
                              seq_maxlen=maxlen,
                              clip_gradients=5.0,
                              checkpoint_path='model_shakespeare')

for i in range(50):
    seed = random_sequence_from_textfile(path, maxlen)
    m.fit(X, Y, validation_set=0.1, batch_size=128,
          n_epoch=1, run_id='shakespeare')
    print("-- TESTING...")
    print("-- Test with temperature of 1.0 --")
    print(m.generate(600, temperature=1.0, seq_seed=seed))
    print("-- Test with temperature of 0.5 --")
    print(m.generate(600, temperature=0.5, seq_seed=seed))

上面的例子需要使用TFLearn框架,可以通过

pip install tflearn

来安装。

TFLearn是专门为Tensorflow开发的高层次API框架。
用TFLearn API的主要好处是可读性更好,比如刚才的核心代码:

g = tflearn.input_data([None, maxlen, len(char_idx)])
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512)
g = tflearn.dropout(g, 0.5)
g = tflearn.fully_connected(g, len(char_idx), activation='softmax')
g = tflearn.regression(g, optimizer='adam', loss='categorical_crossentropy',
                       learning_rate=0.001)

m = tflearn.SequenceGenerator(g, dictionary=char_idx,
                              seq_maxlen=maxlen,
                              clip_gradients=5.0,
                              checkpoint_path='model_shakespeare')

从输入数据,三层LSTM,三层Dropout,最后是一个softmax的全连接层。

我们再来看一个预测泰坦尼克号幸存概率的网络的结构:

# Build neural network
net = tflearn.input_data(shape=[None, 6])
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)

# Define model
model = tflearn.DNN(net)
# Start training (apply gradient descent algorithm)
model.fit(data, labels, n_epoch=10, batch_size=16, show_metric=True)

我们来看看训练第一轮之后生成的戏剧,这个阶段肯定还是有点惨,不过基本的意思有了:

THAISA:
Why, sir, say if becel; sunthy alot but of
coos rytermelt, buy -
bived with wond I saTt fas,'? You and grigper.

FIENDANS:
By my wordhand!

KING RECENTEN:
Wish sterest expeun The siops so his fuurs,
And emour so, ane stamn.
she wealiwe muke britgie; I dafs tpichicon, bist,
Turch ose be fast wirpest neerenler.

NONTo:
So befac, sels at, Blove and rackity;
The senent stran spard: and, this not you so the wount
hor hould batil's toor wate
What if a poostit's of bust contot;
Whit twetemes, Game ifon I am
Ures the fast to been'd matter:
To and lause. Tiess her jittarss,
Let concertaet ar: and not!
Not fearle her g

我们再看看训练10轮之后的结果:

PEMBROKE:
There tell the elder pieres,
Would our pestilent shapeing sebaricity. So have partned in me, Project of Yorle
again, and then when you set man
make plash'd of her too sparent
upon this father be dangerous puny or house;
Born is now been left of himself,
This true compary nor no stretches, back that
Horses had hand or question!

POLIXENES:
I have unproach the strangest
padely carry neerful young Yir,
Or hope not fall-a a cause of banque.

JESSICA:
He that comes to find the just,
And eyes gold, substrovious;
Yea pity a god on a foul rioness, these tebles and purish new head meet again?

20轮之后的结果:

y prison,
Fatal and ominous children and the foot, it will
hear with you: it is my pace comprite
To come my soldiers, if I were dread,
Of breath as what I charge with I well;
Her palace and every tailor, the house of wondrous sweet mark!

STANLEY:
Take that spirit, thou hast
'no whore he did eyes, and what men damned, and
I had evils; by lap, or so,
But wholow'st thy report subject,
Had my rabble against thee;
And no rassians which he secure
of genslications; when I have move undertake-inward, into Bertounce;
Upon a shift, meet as we are. He beggars thing
Have for it will, but joy with the minute cannot whom we prarem
-- Test with temperature of 0.5 --
y prison,
Fatal and ominous rein here,
The princess have all to prince and the marsh of his company
To prove brother in the world,
And we the forest prove more than the heavens on the false report of the fools,
Depose the body of my wits.

DUKE SENIOR:
The night better appelled my part.

ANGELO:
Care you that may you understand your grace
I may speak of a point, and seems as in the heart
Who be deeds to show and sale for so
unhouses me of her soul, and the heart of them from the corder black to stand about up.

CLAUDIO:
The place of the world shall be married his love.

从生成城市名字说起

大家的莎士比亚模型应该正在训练过程中吧,咱们闲着也是闲着,不如从一个更简单的例子来看看这个生成过程。
我们还是取TFLearn的官方例子,通过读取美国主要城市名字列表来生成一些新的城市名字。

我们以Z开头的城市为例:

Zachary
Zafra
Zag
Zahl
Zaleski
Zalma
Zama
Zanesfield
Zanesville
Zap
Zapata
Zarah
Zavalla
Zearing
Zebina
Zebulon
Zeeland
Zeigler
Zela
Zelienople
Zell
Zellwood
Zemple
Zena
Zenda
Zenith
Zephyr
Zephyr Cove
Zephyrhills
Zia Pueblo
Zillah
Zilwaukee
Zim
Zimmerman
Zinc
Zion
Zionsville
Zita
Zoar
Zolfo Springs
Zona
Zumbro Falls
Zumbrota
Zuni
Zurich
Zwingle
Zwolle

一共20580个城市。这个训练就快多了,在纯CPU上训练,大约5到6分钟可以训练一轮。

代码如下,跟上面写莎翁的戏剧的如出一辙:

from __future__ import absolute_import, division, print_function

import os
from six import moves
import ssl

import tflearn
from tflearn.data_utils import *

path = "US_Cities.txt"
if not os.path.isfile(path):
    context = ssl._create_unverified_context()
    moves.urllib.request.urlretrieve("https://raw.githubusercontent.com/tflearn/tflearn.github.io/master/resources/US_Cities.txt", path, context=context)

maxlen = 20

string_utf8 = open(path, "r").read().decode('utf-8')
X, Y, char_idx = \
    string_to_semi_redundant_sequences(string_utf8, seq_maxlen=maxlen, redun_step=3)

g = tflearn.input_data(shape=[None, maxlen, len(char_idx)])
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512)
g = tflearn.dropout(g, 0.5)
g = tflearn.fully_connected(g, len(char_idx), activation='softmax')
g = tflearn.regression(g, optimizer='adam', loss='categorical_crossentropy',
                       learning_rate=0.001)

m = tflearn.SequenceGenerator(g, dictionary=char_idx,
                              seq_maxlen=maxlen,
                              clip_gradients=5.0,
                              checkpoint_path='model_us_cities')

for i in range(40):
    seed = random_sequence_from_string(string_utf8, maxlen)
    m.fit(X, Y, validation_set=0.1, batch_size=128,
          n_epoch=1, run_id='us_cities')
    print("-- TESTING...")
    print("-- Test with temperature of 1.2 --")
    print(m.generate(30, temperature=1.2, seq_seed=seed).encode('utf-8'))
    print("-- Test with temperature of 1.0 --")
    print(m.generate(30, temperature=1.0, seq_seed=seed).encode('utf-8'))
    print("-- Test with temperature of 0.5 --")
    print(m.generate(30, temperature=0.5, seq_seed=seed).encode('utf-8'))

我们看下第一轮训练完生成的城市名:

t and Shoot
Cuthbertd
Lettfrecv
El
Ceoneel Sutd
Sa

第二轮:

stle
Finchford
Finch Dasthond
madloogd
Wlaycoyarfw

第三轮:

averal
Cape Carteret
Acbiropa Heowar Sor Dittoy
Do

第十轮:

hoenchen
Schofield
Stcojos
Schabell
StcaKnerum Cri

第二十轮,好像开始有点意思了:

Hill
Cherry Hills Village
Hillfood Pork
Hillbrook

第三十轮,又有点退化:

ckitat
Kline
Klondike
Klonsder
Klansburg
Dlandon
D

第四十轮:

Branch
Villages of Ocite
Sidaydaton
Sidway
Siddade

第100轮:

Atlasburg
Atmautluak
Attion
Attul
Atta
Aque Creek

tflearn.SequenceGenerator的好处和坏处都是将细节都封装起来了,我们难以看到它的背后发生了什么。

温度参数

其实在前面的结果中,我们只是节选了一种温度下的结果。输出的结果一般都输出几种温度的值。那么这个温度是什么意思呢?

温度是表征概念变化的量。如果温度高,比如大于1,就代表希望输出结果更稳定。稳定的结果就是可能每次生成的句子都是一样的。如果等于1,那么对结果没有影响。如果小于1,那么就会让每次生成的结果变化比较大。
像做诗这样比较浪漫的事情,我们一般希望温度值在0.1以下,有变化才好玩,不是吗?

下面是城市生成值,在三种不同温度下的生成结果:

-- Test with temperature of 1.2 --

Atlasburg
Atmautluak
Attion
Attul
Atta
Aque Creek
-- Test with temperature of 1.0 --

Atlasburg
Atmautluak
Attila
Attaville
Atteville
-- Test with temperature of 0.5 --

Atlasburg
Atmautluak
Attigua
Attinword
Attrove

跨后端高层API - Keras,生成尼采的文章

Keras是可以跨Tensorflow,微软的CNTK等多种后端的API。
可以通过

pip install keras

来安装keras。我们安装了Tensorflow之后,Keras会选用Tensorflow来做它的后端。

我们也看下Keras上文本生成的例子。官方例子是用来生成尼采的句子。

核心语句就6句话:

model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))

optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

下面是完整的代码,大家跑来玩玩吧。如果对尼采不感兴趣,也可以换成别的文章。不过请注意,正如注释中所说的,文本随便换,但是要保持在10万字符以上。最好是100万字符以上。

'''Example script to generate text from Nietzsche's writings.
At least 20 epochs are required before the generated text
starts sounding coherent.
It is recommended to run this script on GPU, as recurrent
networks are quite computationally intensive.
If you try this script on new data, make sure your corpus
has at least ~100k characters. ~1M is better.
'''

from __future__ import print_function
from keras.callbacks import LambdaCallback
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import LSTM
from keras.optimizers import RMSprop
from keras.utils.data_utils import get_file
import numpy as np
import random
import sys
import io

path = get_file('nietzsche.txt', origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
with io.open(path, encoding='utf-8') as f:
    text = f.read().lower()
print('corpus length:', len(text))

chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

# cut the text in semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

print('Vectorization...')
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=np.bool)
y = np.zeros((len(sentences), len(chars)), dtype=np.bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1


# build the model: a single LSTM
print('Build model...')
model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))

optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)


def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)


def on_epoch_end(epoch, logs):
    # Function invoked at end of each epoch. Prints generated text.
    print()
    print('----- Generating text after Epoch: %d' % epoch)

    start_index = random.randint(0, len(text) - maxlen - 1)
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print('----- diversity:', diversity)

        generated = ''
        sentence = text[start_index: start_index + maxlen]
        generated += sentence
        print('----- Generating with seed: "' + sentence + '"')
        sys.stdout.write(generated)

        for i in range(400):
            x_pred = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x_pred[0, t, char_indices[char]] = 1.

            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]

            generated += next_char
            sentence = sentence[1:] + next_char

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()

print_callback = LambdaCallback(on_epoch_end=on_epoch_end)

model.fit(x, y,
          batch_size=128,
          epochs=60,
          callbacks=[print_callback])

文本生成背后的原理 - 只不过是概率的预测而己

TFLearn的封装做得太好,我们看不到细节。所以我们参看一下Keras的同功能实现的代码:

        for i in range(400):
            x_pred = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x_pred[0, t, char_indices[char]] = 1.

            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]

            generated += next_char
            sentence = sentence[1:] + next_char

            sys.stdout.write(next_char)
            sys.stdout.flush()

我们可以看到,本质上是调用model.predict来预测在当前序列下最可能出现的字符是什么。

preds = model.predict(x_pred, verbose=0)[0]

从sample的代码中,我们可以看到温度值背后的原理:

def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)

了解了这些原理之后,使用不同的文本,加上不同的温度值,享受您的机器创作之旅吧!

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 204,684评论 6 478
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 87,143评论 2 381
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 151,214评论 0 337
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 54,788评论 1 277
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 63,796评论 5 368
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 48,665评论 1 281
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 38,027评论 3 399
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 36,679评论 0 258
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 41,346评论 1 299
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 35,664评论 2 321
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 37,766评论 1 331
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 33,412评论 4 321
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 39,015评论 3 307
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 29,974评论 0 19
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 31,203评论 1 260
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 45,073评论 2 350
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 42,501评论 2 343

推荐阅读更多精彩内容