Implementing Regularization with TensorFlow


Prerequisites

What is regularization?

In one sentence: regularization adds a weight penalty term to the loss function so that the network tends to learn smaller weights, which suppresses overfitting and improves the model's ability to generalize. Common regularization methods include L1 regularization, L2 regularization, and Dropout. For the principles and effects of regularization, see 深度学习常用正则化方法 (common regularization methods for deep learning).

This article implements L2 regularization. For completeness, the L2 formula is given here:

L = L_0 + \frac{\lambda}{2}\sum_{i=1}^{n}{w_i^2}

where L_0 is the original cost function, \frac{\lambda}{2}\sum_{i=1}^{n}{w_i^2} is the L2 regularization loss term, \lambda is the weight factor, and w_i are the network weights.
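
Taking the gradient shows why this term favors smaller weights: a gradient-descent update with learning rate \eta becomes

w_i \leftarrow w_i - \eta\left(\frac{\partial L_0}{\partial w_i} + \lambda w_i\right)

so every step shrinks each weight in proportion to \lambda, which is why L2 regularization is also known as weight decay.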

Relevant TensorFlow functions

In TensorFlow, the computation graph (graph) manages resources, including tensors and variables, through collections (collection).

  • tf.add_to_collection

    Adds a resource to a specific collection

  • tf.get_collection

    Retrieves the resources stored in a specific collection

  • Example

    import tensorflow as tf
    
    # step 1: construct variables
    v_0 = tf.Variable(tf.constant([1.0, 2.0, 3.0]), name="v_0")
    v_1 = tf.get_variable(shape=(), name="v_1")
    
    # step 2: add the variables to a custom collection
    tf.add_to_collection(name="variable", value=v_0)
    tf.add_to_collection(name="variable", value=v_1)
    
    init_op = tf.group(tf.global_variables_initializer(),
                       tf.local_variables_initializer())
    with tf.Session() as sess:
        sess.run(init_op)
        # step 3: get the variables back from the collection
        for var in tf.get_collection(key="variable"):
            print('{0}: {1}'.format(var.op.name, var.eval()))
    
  • Result

    v_0: [ 1.  2.  3.]
    v_1: -1.4189265966415405
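
The same mechanism backs TensorFlow's built-in bookkeeping: every tf.Variable automatically registers itself in collections such as tf.GraphKeys.GLOBAL_VARIABLES and, when trainable, tf.GraphKeys.TRAINABLE_VARIABLES, and these are read with the same tf.get_collection call. A minimal sketch, appended to the example above:

    for var in tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES):
        print(var.op.name)  # prints v_0 and v_1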
    

Environment

  • software: tensorflow==1.14.0
  • hardware: RTX 2060

What does regularization actually do?

Theoretical calculation

Assume the weights are W = [1.0, 2.0, 3.0] and the regularization factor is \lambda = 0.00004.

According to the L2 formula, the regularization loss is

weight\_loss = \frac{1}{2} \times 0.00004 \times (1.0^2 + 2.0^2 + 3.0^2) = 0.00028
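
This arithmetic can be double-checked with a few lines of plain Python (a quick sketch; no TensorFlow required):

weights = [1.0, 2.0, 3.0]
weight_decay = 0.00004
print(0.5 * weight_decay * sum(w ** 2 for w in weights))  # ≈ 0.00028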

Code verification

Three ways of computing the L2 regularization loss are used here. The first two are TensorFlow interfaces: the first is the low-level interface, the second a higher-level one that has been dropped in TensorFlow 2.0. The third is a custom implementation of the formula. The code is as follows:

import tensorflow as tf

weight_decay = 0.00004  # regularization weight factor
weight = tf.Variable(initial_value=tf.constant(value=[1.0, 2.0, 3.0]))  # weights

# use tensorflow interfaces
# method 1: low-level interface
weight_loss_1 = tf.nn.l2_loss(weight) * weight_decay
# method 2: higher-level interface (removed in tensorflow 2.0)
weight_loss_2 = tf.contrib.layers.l2_regularizer(scale=weight_decay)(weight)

# custom implementation of the formula
# method 3
custom_weight_loss = tf.reduce_sum(tf.multiply(weight, weight))
custom_weight_loss = 1 / 2 * weight_decay * custom_weight_loss

init_op = tf.group(tf.global_variables_initializer(),
                   tf.local_variables_initializer())
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
sess.run(init_op)
print('method 1 result: {0}'.format(sess.run(weight_loss_1)))
print('method 2 result: {0}'.format(sess.run(weight_loss_2)))
print('method 3 result: {0}'.format(sess.run(custom_weight_loss)))

Result

method 1 result: 0.00028
method 2 result: 0.00028
method 3 result: 0.00028

The results agree with the theoretical derivation. The next step is to add L2 regularization to a network and train with it.

Introducing L2 regularization into a network

The basics of L2 regularization were covered above; this section shows how to introduce it while building a neural network. Doing so involves three steps (a compact end-to-end sketch follows this list):

  • Compute each weight's L2 loss and add it to a collection (collection)

  • Retrieve all the collected weight losses and sum them

  • Add the L2 regularization loss to the original cost to obtain the total loss
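
A minimal sketch of the three steps with a single weight tensor (the tensor shape and the commented loss_op are illustrative; the per-layer helper versions follow below):

import tensorflow as tf

weight = tf.Variable(tf.truncated_normal(shape=[4, 2], stddev=0.01), name='Weights')

# step 1: compute the weight's L2 loss and add it to a collection
tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES,
                     tf.nn.l2_loss(weight) * 0.00004)

# step 2: retrieve and sum all collected weight losses
weight_loss_op = tf.add_n(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES))

# step 3: add to the original cost (loss_op assumed to be defined elsewhere)
# total_loss_op = loss_op + weight_loss_op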

Step 1: collect the weight losses, three ways

  • Using the tf.nn.l2_loss() interface with a collection

    def get_weights_1(shape, weight_decay=0.0, dtype=tf.float32, trainable=True):
        """
        add weight regularization loss to a collection
        Args:
            shape: shape of the weight tensor
            weight_decay: L2 regularization factor
            dtype: data type of the weight
            trainable: whether the weight is trainable

        Returns:
            the weight variable
        """
        weight = tf.Variable(initial_value=tf.truncated_normal(shape=shape, stddev=0.01),
                             name='Weights', dtype=dtype, trainable=trainable)
        if weight_decay > 0:
            weight_loss = tf.nn.l2_loss(weight) * weight_decay
            # another collection would work too, e.g. tf.GraphKeys.LOSSES or a custom key
            tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, value=weight_loss)
        return weight
    

    This takes two steps:

    1. Compute the regularization loss

      weight_loss = tf.nn.l2_loss(weight) * weight_decay

    2. Add the regularization loss to a specific collection (here a built-in TensorFlow collection is used; a custom collection also works)

      tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, value=weight_loss)
      
  • Using tf.contrib.layers.l2_regularizer with a collection

    def get_weights_2(shape, weight_decay=0.0, dtype=tf.float32, trainable=True):
        """
        add weight regularization loss to a collection
        Args:
            shape: shape of the weight tensor
            weight_decay: L2 regularization factor
            dtype: data type of the weight
            trainable: whether the weight is trainable
        Returns:
            the weight variable
        """
        weight = tf.Variable(initial_value=tf.truncated_normal(shape=shape, stddev=0.01),
                             name='Weights', dtype=dtype, trainable=trainable)
        if weight_decay > 0:
            weight_loss = tf.contrib.layers.l2_regularizer(weight_decay)(weight)
            # a custom collection would work too, e.g. tf.add_to_collection("weight_loss", ...)
            tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, value=weight_loss)
        return weight
    

    This takes two steps:

    1. Compute the regularization loss

      weight_loss = tf.contrib.layers.l2_regularizer(weight_decay)(weight)

    2. Add the regularization loss to a specific collection (here a built-in TensorFlow collection is used; a custom collection also works)

      tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, value=weight_loss)
      
  • Using tf.contrib.layers.l2_regularizer with tf.get_variable

    def get_weights_3(shape, weight_decay=0.0, dtype=tf.float32, trainable=True):
        """
        add weight regularization via tf.get_variable
        Args:
            shape: shape of the weight tensor
            weight_decay: L2 regularization factor
            dtype: data type of the weight
            trainable: whether the weight is trainable
        Returns:
            the weight variable
        """
        # create regularizer
        if weight_decay > 0:
            regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
        else:
            regularizer = None
        weight = tf.get_variable(name='Weights', shape=shape, dtype=dtype,
                                 regularizer=regularizer, trainable=trainable)
        return weight
    

    This takes two steps:

    1. Create the regularizer

      regularizer = tf.contrib.layers.l2_regularizer(weight_decay)

    2. Pass the regularizer to tf.get_variable, which computes the regularization loss internally and adds it to the tf.GraphKeys.REGULARIZATION_LOSSES collection

      weight = tf.get_variable(name='Weights', shape=shape, dtype=dtype,
                               regularizer=regularizer, trainable=trainable)
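
    A quick sanity check (a sketch, assuming the helper functions above are defined in the same graph; the names w1/w2 and the shapes are arbitrary): each call contributes exactly one scalar loss to the collection, regardless of which variant created it.

      w1 = get_weights_1(shape=[4, 2], weight_decay=0.00004)
      w2 = get_weights_3(shape=[2, 1], weight_decay=0.00004)
      print(len(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)))  # 2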
      

Step 2: retrieve the weight losses from the collection

There are two ways to retrieve the collected weight losses:

  1. The tf.get_collection() interface, which works with any collection, built-in or custom

    weight_loss_op = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
    weight_loss_op = tf.add_n(weight_loss_op)

  2. The tf.losses.get_regularization_losses() interface, which only works when the regularization losses were collected into the built-in REGULARIZATION_LOSSES collection

    weight_loss_op = tf.losses.get_regularization_losses()
    weight_loss_op = tf.add_n(weight_loss_op)
    

Both methods perform two steps:

  1. Fetch all the collected weight losses from the collection
  2. Sum them with the tf.add_n() interface, which returns the total weight loss
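
tf.add_n sums a list of tensors of identical shape and dtype element-wise; for a list of scalar losses this is simply their total. A minimal standalone sketch:

import tensorflow as tf

losses = [tf.constant(0.1), tf.constant(0.2), tf.constant(0.3)]
total = tf.add_n(losses)
with tf.Session() as sess:
    print(sess.run(total))  # ≈ 0.6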

Step 3: compute the total loss

with tf.variable_scope("loss"):
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits,
                                                            labels=input_label_placeholder,
                                                            name='entropy')
    loss_op = tf.reduce_mean(input_tensor=cross_entropy, name='loss')
    weight_loss_op = tf.losses.get_regularization_losses()
    weight_loss_op = tf.add_n(weight_loss_op)
    total_loss_op = loss_op + weight_loss_op
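
TensorFlow 1.x also provides a shortcut for this step (a sketch; it assumes the data loss has been registered in the tf.GraphKeys.LOSSES collection, either via tf.losses.add_loss or by building the loss with one of the tf.losses.* functions):

tf.losses.add_loss(loss_op)  # register the data loss
total_loss_op = tf.losses.get_total_loss(add_regularization_losses=True)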

Complete code example

Below, a complete model with three fully connected layers is built and one training iteration is run.

Complete code

# @ File       : tf_regularization.py
# @ Description: implement regularization based on TensorFlow
# @ Author     : Alex Chung
# @ Contact    : yonganzhong@outlook.com

import tensorflow as tf

#++++++++++++++++++++++++++++++++construct layers++++++++++++++++++++++++++++++++
def fully_connected(input_op, scope, num_outputs, weight_decay=0.00004, is_activation=True, finetune=True):
    """
    fully connected layer
    Args:
        input_op: input tensor
        scope: variable scope name
        num_outputs: number of output units
        weight_decay: L2 regularization factor
        is_activation: whether to apply ReLU activation
        finetune: whether the layer's variables are trainable

    Returns:
        the layer output tensor
    """
    # get feature num
    shape = input_op.get_shape().as_list()
    if len(shape) == 4:
        size = shape[-1] * shape[-2] * shape[-3]
    else:
        size = shape[1]
    with tf.variable_scope(scope):
        flat_data = tf.reshape(tensor=input_op, shape=[-1, size], name='Flatten')

        weights = get_weights_1(shape=[size, num_outputs], weight_decay=weight_decay, trainable=finetune)
        # weights = get_weights_2(shape=[size, num_outputs], weight_decay=weight_decay, trainable=finetune)
        # weights = get_weights_3(shape=[size, num_outputs], weight_decay=weight_decay, trainable=finetune)
        biases = get_bias(shape=[num_outputs], trainable=finetune)

        if is_activation:
            return tf.nn.relu_layer(x=flat_data, weights=weights, biases=biases)
        else:
            return tf.nn.bias_add(value=tf.matmul(flat_data, weights), bias=biases)


def get_bias(shape, trainable=True):
    """
    get bias
    Args:
        shape: shape of the bias tensor
        trainable: whether the bias is trainable

    Returns:
        the bias variable
    """
    # uses tf.get_variable's default (glorot_uniform) initializer;
    # a zeros initializer is the more common choice for biases
    bias = tf.get_variable(shape=shape, name='Bias', dtype=tf.float32, trainable=trainable)

    return bias

def get_weights_1(shape, weight_decay=0.0, dtype=tf.float32, trainable=True):
    """
    add weight regularization loss to a collection
    Args:
        shape: shape of the weight tensor
        weight_decay: L2 regularization factor
        dtype: data type of the weight
        trainable: whether the weight is trainable

    Returns:
        the weight variable
    """
    weight = tf.Variable(initial_value=tf.truncated_normal(shape=shape, stddev=0.01), name='Weights', dtype=dtype,
                         trainable=trainable)
    if weight_decay > 0:
        weight_loss = tf.nn.l2_loss(weight) * weight_decay
        # another collection would work too, e.g. tf.GraphKeys.LOSSES or a custom key
        tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, value=weight_loss)

    return weight


def get_weights_2(shape, weight_decay=0.0, dtype=tf.float32, trainable=True):
    """
    add weight regularization loss to a collection
    Args:
        shape: shape of the weight tensor
        weight_decay: L2 regularization factor
        dtype: data type of the weight
        trainable: whether the weight is trainable

    Returns:
        the weight variable
    """
    weight = tf.Variable(initial_value=tf.truncated_normal(shape=shape, stddev=0.01), name='Weights', dtype=dtype,
                         trainable=trainable)
    if weight_decay > 0:
        weight_loss = tf.contrib.layers.l2_regularizer(weight_decay)(weight)
        # a custom collection would work too, e.g. tf.add_to_collection("weight_loss", ...)
        tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, value=weight_loss)

    return weight


def get_weights_3(shape, weight_decay=0.0, dtype=tf.float32, trainable=True):
    """
    add weight regularization via tf.get_variable
    Args:
        shape: shape of the weight tensor
        weight_decay: L2 regularization factor
        dtype: data type of the weight
        trainable: whether the weight is trainable
    Returns:
        the weight variable
    """
    # create regularizer
    if weight_decay > 0:
        regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    else:
        regularizer = None
    weight = tf.get_variable(name='Weights', shape=shape, dtype=dtype, regularizer=regularizer,
                             trainable=trainable)
    return weight

#+++++++++++++++++++++++++++++++++construct network+++++++++++++++++++++++++++++++++++
def model_nets(input_batch, num_classes=None, weight_decay=0.00004, scope="test_nets"):
    """
    fully connected network
    Args:
        input_batch: input tensor
        num_classes: number of output classes
        weight_decay: L2 regularization factor
        scope: variable scope name

    Returns:
        the logits tensor
    """
    with tf.variable_scope(scope):
        net = fully_connected(input_batch, num_outputs=128, weight_decay=weight_decay, scope='fc1')
        net = fully_connected(net, num_outputs=32, weight_decay=weight_decay, scope='fc2')
        net = fully_connected(net, num_outputs=num_classes, is_activation=False, weight_decay=weight_decay,
                              scope='logits')
        prob = tf.nn.softmax(net, name='prob')  # probabilities, for inference only
    # return the raw logits: tf.nn.softmax_cross_entropy_with_logits applies
    # softmax itself, so returning prob would apply softmax twice
    return net


#++++++++++++++++++++++++++++++++execute train+++++++++++++++++++++++++++++++++
def main():
    
    # parameter config 
    BATCH_SIZE = 10
    DATA_LENGTH = 1024
    NUM_CLASSES = 5
    LEARNING_RATE = 0.001
    WEIGHT_DECAY = 0.00004
    
    # input placeholders
    global_step = tf.train.get_or_create_global_step()
    input_data_placeholder = tf.placeholder(dtype=tf.float32, shape=[None, DATA_LENGTH], name="input_data")
    input_label_placeholder = tf.placeholder(dtype=tf.float32, shape=[None, NUM_CLASSES], name="input_label")
    # inference part
    logits = model_nets(input_batch=input_data_placeholder, num_classes=NUM_CLASSES, weight_decay=WEIGHT_DECAY)

    # calculate loss part
    with tf.variable_scope("loss"):
        cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=input_label_placeholder,
                                                                name='entropy')
        loss_op = tf.reduce_mean(input_tensor=cross_entropy, name='loss')
        weight_loss_op = tf.losses.get_regularization_losses()
        weight_loss_op = tf.add_n(weight_loss_op)
        total_loss_op = loss_op + weight_loss_op

    # generate data and label
    tf.random.set_random_seed(0)
    data_batch = tf.Variable(tf.random_uniform(shape=(BATCH_SIZE, DATA_LENGTH), minval=0, maxval=1, dtype=tf.float32))
    label_batch = tf.Variable(tf.random_uniform(shape=(BATCH_SIZE,), minval=1, maxval=NUM_CLASSES, dtype=tf.int32))
    label_batch = tf.one_hot(label_batch, depth=NUM_CLASSES) # convert label to onehot
    
    
    # initial variable and graph
    init_op = tf.group(tf.global_variables_initializer(),
                       tf.local_variables_initializer())
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    with tf.Session(config=config) as sess:
        sess.run(init_op)
        input_data, input_label = sess.run([data_batch, label_batch])
        
        print('regularization loss op:')
        for var in tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES):
            print(var.op.name, var.shape)

        # training part
        train_op = tf.train.GradientDescentOptimizer(learning_rate=LEARNING_RATE).minimize(loss=total_loss_op,
                                                                                           global_step=global_step)

        feed_dict = {input_data_placeholder:input_data,
                     input_label_placeholder:input_label}

        _, total_loss, loss, weight_loss = sess.run([train_op, total_loss_op, loss_op, weight_loss_op],
                                                             feed_dict=feed_dict)
        print('loss:{0} weight_loss:{1} total_loss:{2}'.format(loss, weight_loss, total_loss))

 
if __name__ == "__main__":
    main()

Execution result

regularization loss op:
test_nets/fc1/mul ()
test_nets/fc2/mul ()
test_nets/logits/mul ()

loss:1.649293303489685 weight_loss:0.0002091786591336131 total_loss:1.6495025157928467
