TensorFlow Notes 4.2: Learning Rate

Concepts

Learning rate

learning_rate: the magnitude of each parameter update. If the learning rate is too large, the parameters being optimized will oscillate around the minimum and fail to converge; if it is too small, they will converge only slowly.
During training, parameters are updated in the direction of gradient descent on the loss function.
The parameter update formula is:

w_{n+1} = w_n − learning_rate × ∇

where ∇ is the gradient of the loss function with respect to w_n.
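As a quick worked example using the setup of the demo below: with loss = (w + 1)^2, w_n = 5 and learning_rate = 0.1, the gradient is 2(w + 1) = 12, so w_{n+1} = 5 − 0.1 × 12 = 3.8, which is exactly the value of w after the first training step in the output at the end of this note.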

Setting the learning rate

A fixed learning rate therefore forces a trade-off: too large and the parameters oscillate around the minimum without converging; too small and convergence is slow.

Exponentially decaying learning rate: the learning rate is updated dynamically as the number of training steps grows.

The learning rate is computed as:

learning_rate = LEARNING_RATE_BASE × LEARNING_RATE_DECAY ^ (global_step / LEARNING_RATE_STEP)
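As a sanity check, the formula can be evaluated in plain Python with the constants used in the demo below (base 0.1, decay 0.99, decay step 1); the integer division mirrors staircase=True:

LEARNING_RATE_BASE = 0.1
LEARNING_RATE_DECAY = 0.99
LEARNING_RATE_STEP = 1

for global_step in (1, 2, 40):
    # decayed rate after global_step training steps (staircase=True)
    lr = LEARNING_RATE_BASE * LEARNING_RATE_DECAY ** (global_step // LEARNING_RATE_STEP)
    print(global_step, lr)  # 0.099, 0.09801, ~0.066897, matching the run below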

Expressed with TensorFlow's API:

global_step = tf.Variable(0, trainable=False)

learning_rate = tf.train.exponential_decay(
    LEARNING_RATE_BASE,
    global_step,
    LEARNING_RATE_STEP,
    LEARNING_RATE_DECAY,
    staircase=True)  # or staircase=False

Here LEARNING_RATE_BASE is the initial learning rate and LEARNING_RATE_DECAY is the decay rate; global_step records the current number of training steps and is marked non-trainable. LEARNING_RATE_STEP controls how often the learning rate is updated and is usually set to the total number of training samples divided by BATCH_SIZE, i.e. once per pass over the data. With staircase=True, global_step / LEARNING_RATE_STEP is truncated to an integer, so the learning rate decays in a staircase pattern; with staircase=False, the learning rate follows a smooth decay curve.
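To make the staircase distinction concrete, here is a minimal plain-Python sketch; the decay step of 10 is a hypothetical value chosen only so that the two modes visibly differ:

BASE, DECAY, STEP = 0.1, 0.99, 10  # hypothetical constants for illustration

for gs in (5, 10, 15):
    smooth = BASE * DECAY ** (float(gs) / STEP)  # staircase=False: decays a little every step
    stepped = BASE * DECAY ** (gs // STEP)       # staircase=True: flat within each 10-step window
    print(gs, smooth, stepped)

At gs = 5 the smooth rate has already decayed while the stepped one is still at 0.1; the two coincide again at gs = 10.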

Code

#coding:utf-8
# Loss function: loss = (w + 1)^2, with w initialized to the constant 5.
# Back-propagation finds the optimal w, i.e. the w that minimizes loss.
# An exponentially decaying learning rate descends quickly in the early
# iterations and converges within fewer training steps.

import tensorflow as tf

LEARN_RATE_BASE = 0.1   # initial learning rate
LEARN_RATE_DECAY = 0.99 # learning rate decay rate
LEARN_RATE_STEP = 1     # update the learning rate after this many batches; usually total samples / BATCH_SIZE

# Counter for how many batches have been run; starts at 0, not trainable.
global_step = tf.Variable(0, trainable=False)
# Define the exponentially decaying learning rate.
learning_rate = tf.train.exponential_decay(LEARN_RATE_BASE, global_step, LEARN_RATE_STEP, LEARN_RATE_DECAY, staircase=True)
# Define the parameter to optimize, initialized to 5.
w = tf.Variable(tf.constant(5, dtype=tf.float32))
# Define the loss function.
loss = tf.square(w+1)
# Define the back-propagation method.
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
# Create a session and train for 40 steps.
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    for i in range(40):
        sess.run(train_step)
        learning_rate_val = sess.run(learning_rate)
        global_step_val = sess.run(global_step)
        w_val = sess.run(w)
        loss_val = sess.run(loss)
        print("After %s steps: global step is %f, w is %f, learning rate is %f, loss is %f\n" % (i, global_step_val, w_val, learning_rate_val, loss_val))

Output

After 0 steps: global step is 1.000000, w is 3.800000, learning rate is 0.099000, loss is 23.040001

After 1 steps: global step is 2.000000, w is 2.849600, learning rate is 0.098010, loss is 14.819419

After 2 steps: global step is 3.000000, w is 2.095001, learning rate is 0.097030, loss is 9.579033

After 3 steps: global step is 4.000000, w is 1.494386, learning rate is 0.096060, loss is 6.221961

After 4 steps: global step is 5.000000, w is 1.015167, learning rate is 0.095099, loss is 4.060896

After 5 steps: global step is 6.000000, w is 0.631886, learning rate is 0.094148, loss is 2.663051

After 6 steps: global step is 7.000000, w is 0.324608, learning rate is 0.093207, loss is 1.754587

After 7 steps: global step is 8.000000, w is 0.077684, learning rate is 0.092274, loss is 1.161403

After 8 steps: global step is 9.000000, w is -0.121202, learning rate is 0.091352, loss is 0.772287

After 9 steps: global step is 10.000000, w is -0.281761, learning rate is 0.090438, loss is 0.515867

After 10 steps: global step is 11.000000, w is -0.411674, learning rate is 0.089534, loss is 0.346128

After 11 steps: global step is 12.000000, w is -0.517024, learning rate is 0.088638, loss is 0.233266

After 12 steps: global step is 13.000000, w is -0.602644, learning rate is 0.087752, loss is 0.157891

After 13 steps: global step is 14.000000, w is -0.672382, learning rate is 0.086875, loss is 0.107334

After 14 steps: global step is 15.000000, w is -0.729305, learning rate is 0.086006, loss is 0.073276

After 15 steps: global step is 16.000000, w is -0.775868, learning rate is 0.085146, loss is 0.050235

After 16 steps: global step is 17.000000, w is -0.814036, learning rate is 0.084294, loss is 0.034583

After 17 steps: global step is 18.000000, w is -0.845387, learning rate is 0.083451, loss is 0.023905

After 18 steps: global step is 19.000000, w is -0.871193, learning rate is 0.082617, loss is 0.016591

After 19 steps: global step is 20.000000, w is -0.892476, learning rate is 0.081791, loss is 0.011561

After 20 steps: global step is 21.000000, w is -0.910065, learning rate is 0.080973, loss is 0.008088

After 21 steps: global step is 22.000000, w is -0.924629, learning rate is 0.080163, loss is 0.005681

After 22 steps: global step is 23.000000, w is -0.936713, learning rate is 0.079361, loss is 0.004005

After 23 steps: global step is 24.000000, w is -0.946758, learning rate is 0.078568, loss is 0.002835

After 24 steps: global step is 25.000000, w is -0.955125, learning rate is 0.077782, loss is 0.002014

After 25 steps: global step is 26.000000, w is -0.962106, learning rate is 0.077004, loss is 0.001436

After 26 steps: global step is 27.000000, w is -0.967942, learning rate is 0.076234, loss is 0.001028

After 27 steps: global step is 28.000000, w is -0.972830, learning rate is 0.075472, loss is 0.000738

After 28 steps: global step is 29.000000, w is -0.976931, learning rate is 0.074717, loss is 0.000532

After 29 steps: global step is 30.000000, w is -0.980378, learning rate is 0.073970, loss is 0.000385

After 30 steps: global step is 31.000000, w is -0.983281, learning rate is 0.073230, loss is 0.000280

After 31 steps: global step is 32.000000, w is -0.985730, learning rate is 0.072498, loss is 0.000204

After 32 steps: global step is 33.000000, w is -0.987799, learning rate is 0.071773, loss is 0.000149

After 33 steps: global step is 34.000000, w is -0.989550, learning rate is 0.071055, loss is 0.000109

After 34 steps: global step is 35.000000, w is -0.991035, learning rate is 0.070345, loss is 0.000080

After 35 steps: global step is 36.000000, w is -0.992297, learning rate is 0.069641, loss is 0.000059

After 36 steps: global step is 37.000000, w is -0.993369, learning rate is 0.068945, loss is 0.000044

After 37 steps: global step is 38.000000, w is -0.994284, learning rate is 0.068255, loss is 0.000033

After 38 steps: global step is 39.000000, w is -0.995064, learning rate is 0.067573, loss is 0.000024

After 39 steps: global step is 40.000000, w is -0.995731, learning rate is 0.066897, loss is 0.000018
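Two trends are visible in the run: w converges toward the optimum −1, and the learning rate decays from 0.099 after the first step to 0.1 × 0.99^40 ≈ 0.066897 after step 40.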