Norm Examples

# Create an input tensor with two sentences (batch_size=2), each containing 4 tokens (seq_len=4),
# where every token is a 3-dimensional word vector (dim=3), so the input tensor has shape (2, 4, 3).
import numpy as np
X = np.array([
    [[0.1, 0.2, 0.3],
     [1.1, 1.2, 1.3],
     [2.1, 2.2, 2.3],
     [3.1, 3.2, 3.3]],
    
    [[4.1, 4.2, 4.3],
     [5.1, 5.2, 5.3],
     [6.1, 6.2, 6.3],
     [7.1, 7.2, 7.3]]
])

# batch_norm: notice that every token within a sentence becomes identical. The example is a bit contrived,
# but it still illustrates why batch_norm is not well suited to NLP.
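
The original post does not show the code that produced these arrays; below is a minimal sketch that reproduces the batch_norm result, under the assumption that statistics are computed per token position over the batch and feature axes (axis=(0, 2)), with population std, no eps, and no learnable scale/shift. This is what happens, for example, if a (batch, seq_len, dim) tensor is fed directly to an (N, C, L)-style batch-norm layer, so that the 4 token positions act as channels. The variable names are illustrative, not from the original code.

# Hypothetical reconstruction: one mean/std per token position, taken over batch and feature axes
bn_mean = X.mean(axis=(0, 2), keepdims=True)
bn_std  = X.std(axis=(0, 2), keepdims=True)
batch_norm_out = (X - bn_mean) / bn_std   # matches the array printed below
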
array([[[-1.04912609, -0.99916771, -0.94920932],
        [-1.04912609, -0.99916771, -0.94920932],
        [-1.04912609, -0.99916771, -0.94920932],
        [-1.04912609, -0.99916771, -0.94920932]],

       [[ 0.94920932,  0.99916771,  1.04912609],
        [ 0.94920932,  0.99916771,  1.04912609],
        [ 0.94920932,  0.99916771,  1.04912609],
        [ 0.94920932,  0.99916771,  1.04912609]]])

# layer_norm (tokens within a sentence remain distinguishable from one another)
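
Under the same assumptions, the layer_norm result below matches normalizing each sentence as a whole, i.e. one mean/std per sentence taken over both the token and feature axes (axis=(1, 2)):

# Hypothetical reconstruction: one mean/std per sentence, over all tokens and features
ln_mean = X.mean(axis=(1, 2), keepdims=True)
ln_std  = X.std(axis=(1, 2), keepdims=True)
layer_norm_out = (X - ln_mean) / ln_std
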
array([[[-1.42728248, -1.33807733, -1.24887217],
        [-0.53523093, -0.44602578, -0.35682062],
        [ 0.35682062,  0.44602578,  0.53523093],
        [ 1.24887217,  1.33807733,  1.42728248]],

       [[-1.42728248, -1.33807733, -1.24887217],
        [-0.53523093, -0.44602578, -0.35682062],
        [ 0.35682062,  0.44602578,  0.53523093],
        [ 1.24887217,  1.33807733,  1.42728248]]])

# instance_norm (again every token comes out the same, completely ignoring the relationships between tokens in context)
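
The instance_norm result below matches normalizing each token independently over its feature dimension (axis=2), again with population std and no eps:

# Hypothetical reconstruction: one mean/std per (sentence, token) pair, over the feature axis
in_mean = X.mean(axis=2, keepdims=True)
in_std  = X.std(axis=2, keepdims=True)
instance_norm_out = (X - in_mean) / in_std
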
array([[[-1.22474487,  0.        ,  1.22474487],
        [-1.22474487,  0.        ,  1.22474487],
        [-1.22474487,  0.        ,  1.22474487],
        [-1.22474487,  0.        ,  1.22474487]],

       [[-1.22474487,  0.        ,  1.22474487],
        [-1.22474487,  0.        ,  1.22474487],
        [-1.22474487,  0.        ,  1.22474487],
        [-1.22474487,  0.        ,  1.22474487]]])

# group_norm (with n=2, i.e. every two adjacent tokens are normalized together)
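
The group_norm result below matches splitting the 4 tokens of each sentence into 2 groups of 2 adjacent tokens and normalizing each group over its tokens and features together; a sketch under the same assumptions:

# Hypothetical reconstruction: reshape to (batch, num_groups, tokens_per_group, dim),
# then take one mean/std per group
Xg = X.reshape(2, 2, 2, 3)
gn = (Xg - Xg.mean(axis=(2, 3), keepdims=True)) / Xg.std(axis=(2, 3), keepdims=True)
group_norm_out = gn.reshape(2, 4, 3)
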
array([[[-1.18431305, -0.98692754, -0.78954203],
        [ 0.78954203,  0.98692754,  1.18431305],
        [-1.18431305, -0.98692754, -0.78954203],
        [ 0.78954203,  0.98692754,  1.18431305]],

       [[-1.18431305, -0.98692754, -0.78954203],
        [ 0.78954203,  0.98692754,  1.18431305],
        [-1.18431305, -0.98692754, -0.78954203],
        [ 0.78954203,  0.98692754,  1.18431305]]])