[Paper notes] A guide to convolution arithmetic for deep learning

Github

Chapter 1. Introduction

1.1 Discrete convolutions

N: number of spatial axes (an N-D convolution)
n: number of output feature maps
m: number of input feature maps
k_j: kernel size along axis j
i_j: input size along axis j
s_j: stride (distance between two consecutive positions of the kernel) along axis j
p_j: zero padding (number of zeros concatenated at the beginning and at the end of an axis) along axis j

1.2 Pooling

Pooling operations reduce the size of feature maps by using some function to summarize subregions, such as taking the average or the maximum value.
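
A minimal sketch of the idea, assuming a 2-D feature map stored as a list of lists. The helper names `pool2d` and `mean` and the sample values are illustrative and not taken from the paper.

```python
def pool2d(x, k, s, reduce=max):
    """Summarize each k x k window of the 2-D list x, moving with stride s."""
    rows, cols = len(x), len(x[0])
    out = []
    for r in range(0, rows - k + 1, s):
        out_row = []
        for c in range(0, cols - k + 1, s):
            window = [x[r + dr][c + dc] for dr in range(k) for dc in range(k)]
            out_row.append(reduce(window))
        out.append(out_row)
    return out


def mean(values):
    """Average of a non-empty list of numbers."""
    return sum(values) / len(values)


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]

max_pooled = pool2d(feature_map, k=2, s=2)               # max pooling
avg_pooled = pool2d(feature_map, k=2, s=2, reduce=mean)  # average pooling
print(max_pooled)
print(avg_pooled)
```

With k = 2 and s = 2 the 4 x 4 map shrinks to 2 x 2; only the summarizing function differs between the two calls.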

Chapter 2. Convolution arithmetic

The analysis of the relationship between convolutional layer properties is eased by the fact that they don’t interact across axes. Because of that, this chapter will focus on the following simplified setting:

  • 2-D discrete convolutions (N = 2)
  • square inputs (i_1 = i_2 = i),
  • square kernel size (k_1 = k_2 = k),
  • same strides along both axes (s_1 = s_2 = s),
  • same zero padding along both axes (p_1 = p_2 = p).

Note: the results outlined here also generalize to the N-D and non-square cases.

2.1 No zero padding, unit strides (p=0, s=1)

Relationship 1. For any i and k, and for s = 1 and p = 0, o = (i - k) + 1.

2.2 Zero padding, unit strides (p>0, s=1)

Relationship 2. For any i, k and p, and for s = 1,
o = (i - k) + 2p + 1.

2.2.1 Half (same) padding (p=\frac{k-1}{2})

Relationship 3. For any i and for k odd (k = 2n + 1, n \in \mathbb{N}), s = 1 and p = n, o = (i + 2p) - k + 1 = (i + 2n) - (2n + 1) + 1 = i.

2.2.2 Full padding (p=k-1)

Relationship 4. For any i and k, and for p = k - 1 and s = 1, o = i + 2(k - 1) - (k - 1) = i + (k - 1).
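
As a quick sanity check of the unit-stride cases in this section (Relationships 2 to 4), the sketch below slides a kernel over a zero-padded 1-D input and compares the output length with each formula. The helper `correlate1d` and the sample values are assumptions made for illustration.

```python
def correlate1d(x, w, p=0):
    """Unit-stride sliding window over x, zero-padded by p on both ends."""
    xp = [0] * p + list(x) + [0] * p
    k = len(w)
    return [sum(w[t] * xp[j + t] for t in range(k))
            for j in range(len(xp) - k + 1)]


x = [1, 2, 3, 4, 5, 6, 7]   # i = 7
w = [1, 2, 3, 2, 1]         # k = 5 (odd, so half padding is p = 2)
i, k = len(x), len(w)

arbitrary = correlate1d(x, w, p=1)          # Relationship 2 with p = 1
half = correlate1d(x, w, p=(k - 1) // 2)    # Relationship 3
full = correlate1d(x, w, p=k - 1)           # Relationship 4

print(len(arbitrary), len(half), len(full))
```

Half padding keeps the output as long as the input, while full padding grows it by k - 1.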

2.3 No zero padding, non-unit strides (p=0, s>1)

Relationship 5. For any i, k and s, and for p = 0,
o =\lfloor\frac{i - k}{s}\rfloor+1.

2.4 Zero padding, non-unit strides (p>0, s>1)

Relationship 6. For any i, k, p and s,
o =\lfloor\frac{i+2p-k}{s}\rfloor+1.
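
Relationship 6 covers all the previous cases, so it is convenient to wrap it in a small helper. The sketch below (function names are hypothetical) also counts the valid kernel positions by brute force to confirm the formula. It assumes i + 2p >= k, since the floor is not meaningful otherwise.

```python
def conv_output_size(i, k, s=1, p=0):
    """Relationship 6: o = floor((i + 2p - k) / s) + 1, assuming i + 2p >= k."""
    return (i + 2 * p - k) // s + 1


def count_kernel_positions(i, k, s=1, p=0):
    """Count the start offsets at which a size-k kernel fits in the padded axis."""
    padded = i + 2 * p
    return len(range(0, padded - k + 1, s))


for i, k, s, p in [(5, 3, 2, 0), (5, 3, 2, 1), (6, 3, 2, 1), (7, 3, 1, 1)]:
    assert conv_output_size(i, k, s, p) == count_kernel_positions(i, k, s, p)
    print((i, k, s, p), "->", conv_output_size(i, k, s, p))
```

Because of the floor, i = 5 and i = 6 give the same output size for k = 3, s = 2, p = 1: the last row and column of the larger input are never visited by the kernel.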

Chapter 3. Pooling arithmetic

Pooling does not involve zero padding (p=0).

Relationship 7. For any i, k and s,
o =\lfloor\frac{i - k}{s}\rfloor+1.

Chapter 4. Transposed convolution arithmetic

The need for transposed convolutions generally arises from the desire to use a transformation going in the opposite direction of a normal convolution.
Note: as with direct convolutions, transposed convolution properties don’t interact across axes, so the rest of this chapter uses the same simplified setting as Chapter 2.

4.1 Convolution as a matrix operation

4.2 Transposed convolution

4.3 No zero padding, unit strides, transposed (p=0, s=1, C^T)

Relationship 8. A convolution described by s = 1, p = 0 and k has an associated transposed convolution described by k' = k, s' = s and p' = k - 1 and its output size is o' =i' + (k - 1):

i\xrightarrow[]{\quad k, s=1, p=0\quad} o=i-k+1
i'=o=i-k+1\xrightarrow[]{\; k'=k, s'=s, p'=k-1\;} o'=i'+2p'-k'+1=i
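
Relationship 8 can be illustrated in 1-D by writing the convolution as a matrix C and comparing C^T y with a direct convolution over y zero-padded by k - 1. This is a sketch under assumptions: the helper names are hypothetical, and the kernel is flipped in the padded version, a detail not stated in these notes but needed for the two results to agree element by element.

```python
def correlate1d(x, w, p=0):
    """Unit-stride sliding window over x, zero-padded by p on both ends."""
    xp = [0] * p + list(x) + [0] * p
    k = len(w)
    return [sum(w[t] * xp[j + t] for t in range(k))
            for j in range(len(xp) - k + 1)]


def conv_matrix(w, i):
    """Matrix C (o rows, i columns) such that C x equals correlate1d(x, w)."""
    k = len(w)
    o = i - k + 1
    return [[w[c - r] if 0 <= c - r < k else 0 for c in range(i)]
            for r in range(o)]


def matvec_transposed(C, y):
    """Compute C^T y without building the transposed matrix."""
    return [sum(C[r][c] * y[r] for r in range(len(C)))
            for c in range(len(C[0]))]


w = [1, 2, 3]      # k = 3
i = 5
C = conv_matrix(w, i)
y = [1, 0, -1]     # i' = o = i - k + 1 = 3

via_transpose = matvec_transposed(C, y)
via_padding = correlate1d(y, w[::-1], p=len(w) - 1)   # k' = k, s' = 1, p' = k - 1
print(via_transpose, via_padding)
```

Both routes return a vector of length i' + (k - 1) = 5, which is the original input size i.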

4.4 Zero padding, unit strides, transposed (p>0, s=1, C^T)

Relationship 9. A convolution described by s = 1, k and p has an associated transposed convolution described by k' = k, s' = s and p' = k - p - 1 and its output size is
o' = i' + (k - 1) - 2p.

i\xrightarrow[]{\quad k, s=1, p\quad} o=i+2p-k+1
i'=o\xrightarrow[]{\; k'=k, s'=s, p'\;} o'=i'+2p'-k'+1=i\implies p'=k-p-1

4.4.1 Half (same) padding, transposed (p=\frac{k-1}{2}, C^T)

Relationship 10. A convolution described by k = 2n + 1, n \in \mathbb{N}, s = 1 and p = n has an associated transposed convolution described by k' = k, s' = s and p' = k - p - 1 = (2p + 1) - p - 1 = p, and its output size is o' = i' + (k - 1) - 2p = i' + 2n - 2n = i'.

4.4.2 Full padding, transposed (p=k-1, C^T)

Relationship 11. A convolution described by s = 1, k and p = k - 1 has an associated transposed convolution described by k' = k, s' = s and p' = k - p - 1 = 0, and its output size is o' = i' + (k - 1) - 2p = i' - (k - 1).
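
The unit-stride transposed cases (no padding, half padding, full padding) all follow from p' = k - p - 1. The sketch below checks that the round trip recovers the input size. Function names are hypothetical, and p <= k - 1 is assumed so that p' is non-negative.

```python
def conv_output_size(i, k, s=1, p=0):
    """o = floor((i + 2p - k) / s) + 1."""
    return (i + 2 * p - k) // s + 1


def transposed_output_size(i_prime, k, p):
    """Unit-stride case: a direct convolution with k' = k, s' = 1, p' = k - p - 1."""
    p_prime = k - p - 1
    return conv_output_size(i_prime, k, s=1, p=p_prime)


i, k = 6, 3
cases = {"no padding": 0, "half padding": (k - 1) // 2, "full padding": k - 1}
for name, p in cases.items():
    o = conv_output_size(i, k, s=1, p=p)
    o_prime = transposed_output_size(o, k, p)
    assert o_prime == o + (k - 1) - 2 * p == i
    print(f"{name}: i={i} -> o={o} -> o'={o_prime}")
```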

4.5 No zero padding, non-unit strides, transposed (p=0, s>1, C^T)

Relationship 12. A convolution described by p = 0, k and s and whose input size is such that i - k is a multiple of s, has an associated transposed convolution described by \tilde{i'}, k' = k, s' = 1 and p' = k - 1, where \tilde{i'} is the size of the stretched input obtained by adding s - 1 zeros between each input unit, and its output size is o' = s(i' - 1) + k.

i\xrightarrow[]{\quad k, s, p=0\quad} o=\lfloor\frac{i-k}{s}\rfloor+1
\tilde{i'}=i'+(s-1)(i'-1)=s(i'-1)+1\xrightarrow[]{\; k'=k, s'=1, p'=k-1\;} o'=\lfloor\frac{\tilde{i'}+2p'-k'}{s'}\rfloor+1=s(i'-1)+k

4.6 Zero padding, non-unit strides, transposed (p>0, s>1, C^T)

Relationship 13. A convolution described by p, k and s and whose input size is such that i + 2p - k is a multiple of s, has an associated transposed convolution described by \tilde{i'}, k' = k, s' = 1 and p' = k - p - 1, where \tilde{i'} is the size of the stretched input obtained by adding s - 1 zeros between each input unit, and its output size is o' = s(i' - 1) + k - 2p.

i\xrightarrow[]{\quad k, s, p\quad} o=\lfloor\frac{i+2p-k}{s}\rfloor+1
\tilde{i'}=i'+(s-1)(i'-1)=s(i'-1)+1\xrightarrow[]{\; k'=k, s'=1, p'=k-p-1\;} o'=\lfloor\frac{\tilde{i'}+2p'-k'}{s'}\rfloor+1=s(i'-1)+k-2p
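
The stretched-input construction of Relationships 12 and 13 can be sketched in 1-D as follows. The helper names are hypothetical. The kernel flip is an added assumption (it makes the result match a true transpose and has no effect on sizes).

```python
def stretch(y, s):
    """Insert s - 1 zeros between consecutive elements of y."""
    out = []
    for idx, value in enumerate(y):
        if idx > 0:
            out.extend([0] * (s - 1))
        out.append(value)
    return out


def correlate1d(x, w, p=0):
    """Unit-stride sliding window over x, zero-padded by p on both ends."""
    xp = [0] * p + list(x) + [0] * p
    k = len(w)
    return [sum(w[t] * xp[j + t] for t in range(k))
            for j in range(len(xp) - k + 1)]


def transposed_conv1d(y, w, s, p):
    """Emulate a transposed convolution with s' = 1 and p' = k - p - 1."""
    k = len(w)
    return correlate1d(stretch(y, s), w[::-1], p=k - p - 1)


y = [1, 2, 3]    # i' = 3
w = [1, 1, 1]    # k = 3
s, p = 2, 1

stretched = stretch(y, s)                 # size s(i' - 1) + 1
out = transposed_conv1d(y, w, s=s, p=p)   # size s(i' - 1) + k - 2p
print(stretched, out)
```

Here the output has size 2(3 - 1) + 3 - 2 = 5, which is the input size of a direct convolution with k = 3, s = 2, p = 1 that produces 3 outputs.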

Relationship 14. A convolution described by p, k and s has an associated transposed convolution described by a, \tilde{i'}, k' = k, s' = 1 and p' = k - p - 1, where \tilde{i'} is the size of the stretched input obtained by adding s - 1 zeros between each input unit, and a = (i + 2p - k) mod s is the number of zeros added to the bottom and right edges of the input. Its output size is o' = s(i' - 1) + a + k - 2p.
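
The role of a is easiest to see with a round trip: several input sizes i map to the same o under the floor, and a selects which one the transposed convolution maps back to. A small sketch, with hypothetical function names:

```python
def conv_output_size(i, k, s=1, p=0):
    """o = floor((i + 2p - k) / s) + 1."""
    return (i + 2 * p - k) // s + 1


def transposed_output_size(i_prime, k, s, p, a=0):
    """Relationship 14: o' = s(i' - 1) + a + k - 2p."""
    return s * (i_prime - 1) + a + k - 2 * p


k, s, p = 3, 2, 1
for i in range(5, 10):
    o = conv_output_size(i, k, s, p)
    a = (i + 2 * p - k) % s
    assert transposed_output_size(o, k, s, p, a) == i
    print(f"i={i}: o={o}, a={a}")
```

For example, i = 5 and i = 6 both give o = 3; they are told apart by a = 0 and a = 1 respectively.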

Chapter 5. Miscellaneous convolutions

5.1 Dilated convolutions

Dilated convolutions are used to cheaply increase the receptive field of output units without increasing the kernel size. Implementations usually insert d - 1 spaces between kernel elements, so that d = 1 corresponds to a regular convolution.
A kernel of size k dilated by a factor d has an effective size \hat{k} = k + (k - 1)(d - 1).

Relationship 15. For any i, k, p and s, and for a dilation rate d, o=\lfloor\frac{i+2p-\hat{k}}{s}\rfloor+1=\lfloor\frac{i+2p-k-(k-1)(d-1)}{s}\rfloor+1.
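
A short sketch of the effective kernel size and of Relationship 15, assuming the "d - 1 zeros between kernel elements" convention described above. The function names are hypothetical.

```python
def dilate_kernel(w, d):
    """Insert d - 1 zeros between consecutive kernel elements."""
    out = []
    for idx, value in enumerate(w):
        if idx > 0:
            out.extend([0] * (d - 1))
        out.append(value)
    return out


def dilated_output_size(i, k, s=1, p=0, d=1):
    """Relationship 15, using the effective size k_hat = k + (k - 1)(d - 1)."""
    k_hat = k + (k - 1) * (d - 1)
    return (i + 2 * p - k_hat) // s + 1


w = [1, 2, 3]                       # k = 3
dilated = dilate_kernel(w, d=2)     # effective size 3 + 2 * 1 = 5
print(dilated, dilated_output_size(7, k=3, d=2))
```

With d = 1 the helper reduces to the regular output-size formula, so the same 7-unit input gives 5 outputs instead of 3.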
