基于tensorflow的正则化实现

前置条件

什么是正则化(regularization)

如果用一句话解释：正则化就是通过增加权重惩罚(penalty)项到损失函数，让网络倾向于学习小一点的权重，从而达到抑制过拟合，增加模型泛化能力的效果。常见的正则化方法有 $L1$ 正则化， $L2$ 正则化和Dropout正则化等。关于正则化的原理和作用请参考深度学习常用正则化方法

本文以 $L2$ 正则化进行实现，为了完整性，这个给出 $L2$ 正则化的公式
$L = L_0 + \frac{\lambda}{2}\sum_{i=1}^{n}{(w^2)}$
式中 $L_0$ 是原始代价损失函数

$\frac{\lambda}{2}\sum_{i=1}^{n}{(w^2)}$ 是 $L2$ 正则化损失函数，其中 $\lambda$ 是权重因子， $w$ 为权重

tensorflow依赖函数

在tensorflow 中，计算图(graph)通过集合(collection)来管理包括张量(tensor)、变量(variable)、资源

tf.add_to_collection

将资源添加到特定的集合中
tf.get_collection

从特定集合中取出对应的资源

示例

import tensorflow as tf

# step 1 contruct variable
v_0 = tf.Variable(tf.constant([1.0, 2.0, 3.0]), name="v_0")
v_1 = tf.get_variable(shape=(), name="v_1")

# step 2 add variable to collection
tf.add_to_collection(name="variable", value=v_0)
tf.add_to_collection(name="variable", value=v_1)

init_op = tf.group(tf.global_variables_initializer(),
tf.local_variables_initializer())
with tf.Session() as sess:
    sess.run(init_op)
    # step 3 get variable from collection
    for var in tf.get_collection(key="variable"):
        print('{0}: {1}'.format(var.op.name, var.eval()))

结果

v_0: [ 1.  2.  3.]
v_1: -1.4189265966415405

requirement enviroment

software: tensorflow==1.14.0
hardware: GTX 2060

正则化到底做了什么

理论计算

这里假设处理条件：权重 $W$ 为 $[1.0, 2.0, 3.0]$ , 正则化因子 $\lambda$ 为 $0.00004$

根据公式 $L2$ ，正则化损失函数如下

$weight\_loss=\frac{1}{2}*0.00004*(1.0^2+2.0^2+3.0^2)=0.00028$

代码验证

这里使用了三种方式计算 $L2$ 正则化，前两种为tensorflow 接口，其中第一种为底层接口，第二种为更高级的接口，不过在tensorflow 2.0中已经抛弃了；第三种为根据公式自定义实现接口。代码如下

weight_decay = 0.00004  # 正则化权重因子
weight = tf.Variable(initial_value=tf.constant(value=[1.0, 2.0, 3.0])) # 权重

# use tensorflow interface
# method 1
weight_loss_1 = tf.nn.l2_loss(weight) * weight_decay
# method 2
weight_loss_2 = tf.contrib.layers.l2_regularizer(scale=weight_decay)(weight)

# cunstom 
# method 3
custom_weight_loss = tf.reduce_sum(tf.multiply(weight, weight))
custom_weight_loss = 1 / 2 * weight_decay * custom_weight_loss

init_op = tf.group(tf.global_variables_initializer(), 
                   tf.local_variables_initializer())
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
sess.run(init_op)

结果

method 1 result: 0.00028
method 2 result: 0.00028
method 3 result: 0.00028

可以看出运行结果于理论推导一致，接下来就是如何在网络中加入 $L2$ 正则化，并完成训练。

在网络中引入L2正则化

上述内容已经介绍了 $L2$ 的基本概念和使用，接下来将介绍如何在神经网络的构建中引入 $L2$ 正则化。 $L2$ 正则化在神经网络中的使用主要包括三个步骤：

计算权重的 $L2$ 损失并添加到集合(collection)中
分别取出集合中所有权重的 $L2$ 损失值并相加
$L2$ 正则化损失函数与原始代价损失函数相加得到总的损失函数

第一步：三种方式收集权重损失函数

使用f.nn.l2_loss()接口与自定义collection 接口

def get_weights_1(shape, weight_decay=0.0, dtype=tf.float32, trainable=True):
    """
    add weight regularization to loss collection
    Args:
        shape: 
        weight_decay: 
        dtype: 
        trainable: 

    Returns:
    """
    weight = tf.Variable(initial_value=tf.truncated_normal(shape=shape, stddev=0.01),                          name='Weights', dtype=dtype, trainable=trainable)
    if weight_decay > 0:
        weight_loss = tf.nn.l2_loss(weight) * weight_decay
        # weight_loss = tf.nn.l2_loss(weight, name="weight_loss")
        # tf.add_to_collection(tf.GraphKeys.LOSSES, value=weight_loss)
        tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, value=weight_loss)
    else:
        pass
    return weight

分为两步：

计算正则化损失

weight_loss = tf.nn.l2_loss(weight) * weight_decay

将正则化损失添加到特定集合中(这里直接添加到tensorflow内置集合，也可以添加到自定义集合)
```
tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, value=weight_loss)
```

使用 tf.contrib.layers.l2_regularizer 与自定义collection 接口

def get_weights_2(shape, weight_decay=0.0, dtype=tf.float32, trainable=True):
    """

    Args:
        shape:
        weight_decay:
        dtype:
        trainable:
    Returns:

    """
    weight = tf.Variable(initial_value=tf.truncated_normal(shape=shape, stddev=0.01),                          name='Weights', dtype=dtype, trainable=trainable)
    if weight_decay > 0:
        weight_loss = tf.contrib.layers.l2_regularizer(weight_decay)(weight)
        # weight_loss = tf.nn.l2_loss(weight, name="weight_loss")
        # tf.add_to_collection("weight_loss", value=weight_loss)
        tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, value=weight_loss)
    else:
        pass
    return weight

分为两步：

计算正则化损失

weight_loss = tf.contrib.layers.l2_regularizer(weight_decay)(weight)

将正则化损失添加到特定集合中(这里直接添加到tensorflow内置集合，也可以添加到自定义集合)
```
tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, value=weight_loss)
```

使用tf.contrib.layers.l2_regularizer 与 tf.get_variable接口

def get_weights_3(shape, weight_decay=0.0, dtype=tf.float32, trainable=True):
    """
    add weight to tf.get_variable
    Args:
        shape:
        weight_decay:
        dtype:
        trainable:
    Returns:

    """
    # create regularizer
    if weight_decay > 0:
        regularizer= tf.contrib.layers.l2_regularizer(weight_decay)
    else:
        regularizer = None
        weight = tf.get_variable(name='Weights', shape=shape, dtype=dtype,                                              regularizer=regularizer, trainable=trainable)
    return weight

分为两步：

生成正则化器

regularizer= tf.contrib.layers.l2_regularizer(weight_decay)

将正则化器参数传入tf.get_variable，tf.get_variable 会内置计算正则化损失函数，并添加到tf.GraphKeys.REGULARIZATION_LOSSES 集合中

weight = tf.get_variable(name='Weights', shape=shape, dtype=dtype,                                              regularizer=regularizer, trainable=trainable)

第二步：从集合中获取权重损失

有两种方法可以获取集合中的权重损失函数：

通过tf.get_collection()接口，支持所有的集合遍历，包括内置集合和自定义集合

weight_loss_op = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
weight_loss_op = tf.add_n(weight_loss_op)

通过tf.losses.get_regularization_losses()接口，只支持正则化损失收集到REGULARIZATION_LOSSES内置集合的情况
```
weight_loss_op = tf.losses.get_regularization_losses()
weight_loss_op = tf.add_n(weight_loss_op)
```

两种方法都执行两步：

从特定集合中获取收集的全部权重损失
使用tf.add_n()接口，遍历并相加所有权重损失项，并返回权重损失之和

第三步：获取总的损失函数

 with tf.variable_scope("loss"):
        cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits,                                   labels=input_label_placeholder,name='entropy')
        loss_op = tf.reduce_mean(input_tensor=cross_entropy, name='loss')
        weight_loss_op = tf.losses.get_regularization_losses()
        weight_loss_op = tf.add_n(weight_loss_op)
        total_loss_op = loss_op + weight_loss_op

完整代码示例

下面构建一个完整的包含三层全连接层的网络模型，完成一次迭代训练

完整代码

# @ File       : tf_regularization.py
# @ Description: realize regularization base tensorflow
# @ Author     : Alex Chung
# @ Contact    : yonganzhong@outlook.com

import tensorflow as tf

#+++++++++++++++++++++++++construct 
def fully_connected(input_op, scope, num_outputs, weight_decay=0.00004, is_activation=True, fineturn=True):
    """
     full connect layer
    Args:
        input_op: 
        scope: 
        num_outputs: 
        weight_decay: 
        is_activation: 
        fineturn: 

    Returns:

    """
    # get feature num
    shape = input_op.get_shape().as_list()
    if len(shape) == 4:
        size = shape[-1] * shape[-2] * shape[-3]
    else:
        size = shape[1]
    with tf.compat.v1.variable_scope(scope):
        flat_data = tf.reshape(tensor=input_op, shape=[-1, size], name='Flatten')

        weights =get_weights_1(shape=[size, num_outputs], weight_decay=weight_decay, trainable=fineturn)
        # weights = get_weights_2(shape=[size, num_outputs], weight_decay=weight_decay, trainable=fineturn)
        # weights = get_weights_3(shape=[size, num_outputs], weight_decay=weight_decay, trainable=fineturn)
        biases =get_bias(shape=[num_outputs], trainable=fineturn)

        if is_activation:
             return tf.nn.relu_layer(x=flat_data, weights=weights, biases=biases)
        else:
            return tf.nn.bias_add(value=tf.matmul(flat_data, weights), bias=biases)


def get_bias(shape, trainable=True):
    """
    get bias
    Args:
        shape: 
        trainable: 

    Returns:

    """
    bias = tf.get_variable(shape=shape, name='Bias', dtype=tf.float32, trainable=trainable)

    return bias

def get_weights_1(shape, weight_decay=0.0, dtype=tf.float32, trainable=True):
    """
    add weight regularization to loss collection
    Args:
        shape:
        weight_decay:
        dtype:
        trainable:

    Returns:

    """
    weight = tf.Variable(initial_value=tf.truncated_normal(shape=shape, stddev=0.01), name='Weights', dtype=dtype,
                         trainable=trainable)
    if weight_decay > 0:
        weight_loss = tf.nn.l2_loss(weight) * weight_decay
        # weight_loss = tf.nn.l2_loss(weight, name="weight_loss")
        # tf.add_to_collection(tf.GraphKeys.LOSSES, value=weight_loss)
        tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, value=weight_loss)
    else:
        pass

    return weight


def get_weights_2(shape, weight_decay=0.0, dtype=tf.float32, trainable=True):
    """

    Args:
        shape:
        weight_decay:
        dtype:
        trainable:

    Returns:

    """
    weight = tf.Variable(initial_value=tf.truncated_normal(shape=shape, stddev=0.01), name='Weights', dtype=dtype,
                         trainable=trainable)
    if weight_decay > 0:
        weight_loss = tf.contrib.layers.l2_regularizer(weight_decay)(weight)
        # weight_loss = tf.nn.l2_loss(weight, name="weight_loss")
        # tf.add_to_collection("weight_loss", value=weight_loss)
        tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, value=weight_loss)
    else:
        pass

    return weight


def get_weights_3(shape, weight_decay=0.0, dtype=tf.float32, trainable=True):
    """
    add weight to tf.get_variable
    Args:
        shape:
        weight_decay:
        dtype:
        trainable:
    Returns:

    """
    # create regularizer
    if weight_decay > 0:
        regularizer= tf.contrib.layers.l2_regularizer(weight_decay)
    else:
        regularizer = None
    weight = tf.get_variable(name='Weights', shape=shape, dtype=dtype, regularizer=regularizer,
                              trainable=trainable)
    return weight

#+++++++++++++++++++++++++++++++++consruct network+++++++++++++++++++++++++++++++++++
def model_nets(input_batch, num_classes=None, weight_decay=0.00004, scope="test_nets"):
    """
    full connect network
    Args:
        input_batch: 
        num_classes: 
        weight_decay: 
        scope: 

    Returns:

    """
    with tf.variable_scope(scope):
        net = fully_connected(input_batch, num_outputs=128, weight_decay=weight_decay, scope='fc1')
        net = fully_connected(net, num_outputs=32, weight_decay=weight_decay, scope='fc2')
        net = fully_connected(net, num_outputs=num_classes, is_activation=False, weight_decay=weight_decay, 
                              scope='logits')
        prob = tf.nn.softmax(net, name='prob')
    return prob


#++++++++++++++++++++++++++++++++execute trarin+++++++++++++++++++++++++++++++++
def main():
    
    # parameter config 
    BATCH_SIZE = 10
    DATA_LENGTH = 1024
    NUM_CLASSES = 5
    LEARNING_RATE = 0.001
    WEIGHT_DECAY = 0.00004
    
    # inference part
    global_step = tf.train.get_or_create_global_step()
    input_data_placeholder = tf.placeholder(dtype=tf.float32, shape=[None, DATA_LENGTH], name="input_data")
    input_label_placeholder = tf.placeholder(dtype=tf.float32, shape=[None, NUM_CLASSES], name="input_label")
    # inference part
    logits = model_nets(input_batch=input_data_placeholder, num_classes=NUM_CLASSES, weight_decay=WEIGHT_DECAY)

    # calculate loss part
    with tf.variable_scope("loss"):
        cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=input_label_placeholder,
                                                                name='entropy')
        loss_op = tf.reduce_mean(input_tensor=cross_entropy, name='loss')
        weight_loss_op = tf.losses.get_regularization_losses()
        weight_loss_op = tf.add_n(weight_loss_op)
        total_loss_op = loss_op + weight_loss_op

    # generate data and label
    tf.random.set_random_seed(0)
    data_batch = tf.Variable(tf.random_uniform(shape=(BATCH_SIZE, DATA_LENGTH), minval=0, maxval=1, dtype=tf.float32))
    label_batch = tf.Variable(tf.random_uniform(shape=(BATCH_SIZE,), minval=1, maxval=NUM_CLASSES, dtype=tf.int32))
    label_batch = tf.one_hot(label_batch, depth=NUM_CLASSES) # convert label to onehot
    
    
    # initial variable and graph
    init_op = tf.group(tf.global_variables_initializer(),
                       tf.local_variables_initializer())
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    with tf.Session(config=config) as sess:
        sess.run(init_op)
        input_data, input_label = sess.run([data_batch, label_batch])
        
        print('regularization loss op:')
        for var in tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES):
            print(var.op.name, var.shape)

        # training part
        train_op = tf.train.GradientDescentOptimizer(learning_rate=LEARNING_RATE).minimize(loss=total_loss_op,
                                                                                           global_step=global_step)

        feed_dict = {input_data_placeholder:input_data,
                     input_label_placeholder:input_label}

        _, total_loss, loss, weight_loss = sess.run([train_op, total_loss_op, loss_op, weight_loss_op],
                                                             feed_dict=feed_dict)
        print('loss:{0} weight_loss:{1} total_loss:{2}'.format(loss, weight_loss, total_loss))

 
if __name__ == "__main__":
    main()

执行结果

regularization loss op:
test_nets/fc1/mul ()
test_nets/fc2/mul ()
test_nets/logits/mul ()

loss:1.649293303489685 weight_loss:0.0002091786591336131 total_loss:1.6495025157928467

基于tensorflow的正则化实现

基于tensorflow的正则化实现

前置条件

什么是正则化(regularization)

tensorflow依赖函数

requirement enviroment

正则化到底做了什么

理论计算

代码验证

结果

在网络中引入L2正则化

第一步：三种方式收集权重损失函数

第二步：从集合中获取权重损失

第三步：获取总的损失函数

完整代码示例

完整代码

执行结果

参考资料