A Worked Example of Backpropagation Through a Convolutional Layer (stride=2)

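This post works through an example of backpropagation through a convolutional layer with a stride of 2, deriving the gradients of the input, the weights, and the bias. Assume the input feature map X is a 4x4 matrix and the kernel W is 2x2. With a "valid" convolution (no padding) and stride 2, the output feature map Y is a 2x2 matrix. Here * denotes the sliding-window product used in deep learning frameworks, in which the kernel is not flipped: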

\left[ \begin{matrix} x_{11} & x_{12} & x_{13} & x_{14} \\ x_{21} & x_{22} & x_{23} & x_{24} \\ x_{31} & x_{32} & x_{33} & x_{34} \\ x_{41} & x_{42} & x_{43} & x_{44} \end{matrix} \right]* \left[ \begin{matrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{matrix} \right]= \left[ \begin{matrix} y_{11} & y_{12} \\ y_{21} & y_{22} \end{matrix} \right]

The entries of Y can be written out as:
y_{11} = w_{11} x_{11} + w_{12} x_{12} + w_{21} x_{21} + w_{22} x_{22} + b
y_{12} = w_{11} x_{13} + w_{12} x_{14} + w_{21} x_{23} + w_{22} x_{24} + b
y_{21} = w_{11} x_{31} + w_{12} x_{32} + w_{21} x_{41} + w_{22} x_{42} + b
y_{22} = w_{11} x_{33} + w_{12} x_{34} + w_{21} x_{43} + w_{22} x_{44} + b
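
The four expressions above can be reproduced with a short sketch in plain Python. The function name `conv2d_forward` and all numeric values are illustrative assumptions, not part of the original derivation; the loop simply slides the 2x2 kernel over X in steps of 2.

```python
def conv2d_forward(x, w, b, stride):
    """'Valid' sliding-window product without kernel flipping."""
    kh, kw = len(w), len(w[0])
    out_h = (len(x) - kh) // stride + 1
    out_w = (len(x[0]) - kw) // stride + 1
    y = [[b] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            for p in range(kh):
                for q in range(kw):
                    # y_ij accumulates w_pq * x at the strided window position
                    y[i][j] += w[p][q] * x[i * stride + p][j * stride + q]
    return y


# Illustrative values only (assumed, not from the article).
x = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
w = [[1, 2],
     [3, 4]]
b = 1

y = conv2d_forward(x, w, b, stride=2)
print(y)  # [[45, 65], [125, 145]]
```

For instance, y_{12} = 1*3 + 2*4 + 3*7 + 4*8 + 1 = 65, which matches the second expression above.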

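Let \partial Y denote the gradient of the cost function with respect to the output Y, so that \partial y_{ij} is shorthand for the derivative of the cost with respect to y_{ij}. It is a 2x2 matrix: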
\left[ \begin{matrix} \partial y_{11} & \partial y_{12} \\ \partial y_{21} & \partial y_{22} \end{matrix} \right ]

We now derive the gradients of the input, the weights, and the bias in turn.

Gradient of the input

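Applying the chain rule to the expressions for Y gives the gradient of each entry of X. Because the stride equals the kernel size, the windows do not overlap, so each x_{ij} appears in exactly one output and its gradient has a single term: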
\partial x_{11} = \partial y_{11} w_{11}
\partial x_{12} = \partial y_{11} w_{12}
\partial x_{13} = \partial y_{12} w_{11}
\partial x_{14} = \partial y_{12} w_{12}
\partial x_{21} = \partial y_{11} w_{21}
\partial x_{22} = \partial y_{11} w_{22}
\partial x_{23} = \partial y_{12} w_{21}
\partial x_{24} = \partial y_{12} w_{22}
\partial x_{31} = \partial y_{21} w_{11}
\partial x_{32} = \partial y_{21} w_{12}
\partial x_{33} = \partial y_{22} w_{11}
\partial x_{34} = \partial y_{22} w_{12}
\partial x_{41} = \partial y_{21} w_{21}
\partial x_{42} = \partial y_{21} w_{22}
\partial x_{43} = \partial y_{22} w_{21}
\partial x_{44} = \partial y_{22} w_{22}
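
These sixteen expressions amount to scattering each \partial y_{ij}, scaled by W, back onto the window it was computed from. A minimal sketch in plain Python follows; the helper name `input_grad_direct` and the numeric values are assumptions chosen for illustration.

```python
def input_grad_direct(dy, w, stride, in_h, in_w):
    """Chain rule applied entry by entry: dx[window] += dy_ij * W."""
    dx = [[0] * in_w for _ in range(in_h)]
    for i in range(len(dy)):
        for j in range(len(dy[0])):
            for p in range(len(w)):
                for q in range(len(w[0])):
                    dx[i * stride + p][j * stride + q] += dy[i][j] * w[p][q]
    return dx


# Illustrative values only (assumed, not from the article).
w = [[1, 2],
     [3, 4]]
dy = [[10, 20],
      [30, 40]]

dx = input_grad_direct(dy, w, stride=2, in_h=4, in_w=4)
for row in dx:
    print(row)
# [10, 20, 20, 40]
# [30, 40, 60, 80]
# [30, 60, 40, 80]
# [90, 120, 120, 160]
```

Each 2x2 block of the result is one entry of \partial Y times W, for example the top-right block is 20 * W.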

In matrix form:
\partial X = \left[ \begin{matrix} \partial x_{11} & \partial x_{12} & \partial x_{13} & \partial x_{14}\\ \partial x_{21} & \partial x_{22} & \partial x_{23} & \partial x_{24}\\ \partial x_{31} & \partial x_{32} & \partial x_{33} & \partial x_{34}\\ \partial x_{41} & \partial x_{42} & \partial x_{43} & \partial x_{44} \end{matrix} \right]= \left[ \begin{matrix} 0 & 0 & 0 & 0 & 0 \\ 0 & \partial y_{11} & 0 & \partial y_{12} & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & \partial y_{21} & 0 & \partial y_{22} & 0\\ 0 & 0 & 0 & 0 & 0 \end{matrix} \right]* \left[ \begin{matrix} w_{22} & w_{21} \\ w_{12} & w_{11} \end{matrix} \right]
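More compactly:
\partial X = pad(\partial Y) * rot180(W)
Because the forward stride is 2, computing \partial X requires internal padding of \partial Y: one zero (stride minus 1) is inserted between neighbouring entries, in addition to the border of zeros (kernel size minus 1) around the outside. The padded \partial Y is then convolved with rot180(W) at stride 1. This is very different from the stride-1 case, where only the border padding is needed.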
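
The padded form can be checked with a short sketch in plain Python. The helper names (`zero_insert_and_pad`, `rot180`, `correlate_valid`) and the numeric values are assumptions for illustration, and the border width of kernel size minus 1 is specific to this example, where the windows tile the input exactly.

```python
def zero_insert_and_pad(dy, stride, border):
    """Insert (stride - 1) zeros between entries, then add a zero border."""
    h, w_ = len(dy), len(dy[0])
    inner_h = (h - 1) * stride + 1
    inner_w = (w_ - 1) * stride + 1
    out = [[0] * (inner_w + 2 * border) for _ in range(inner_h + 2 * border)]
    for i in range(h):
        for j in range(w_):
            out[border + i * stride][border + j * stride] = dy[i][j]
    return out


def rot180(w):
    """Rotate the kernel by 180 degrees (reverse rows and columns)."""
    return [row[::-1] for row in w[::-1]]


def correlate_valid(a, k):
    """Stride-1 'valid' sliding-window product without kernel flipping."""
    kh, kw = len(k), len(k[0])
    out_h = len(a) - kh + 1
    out_w = len(a[0]) - kw + 1
    return [[sum(k[p][q] * a[i + p][j + q]
                 for p in range(kh) for q in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


# Illustrative values only (assumed, not from the article).
w = [[1, 2],
     [3, 4]]
dy = [[10, 20],
      [30, 40]]

padded = zero_insert_and_pad(dy, stride=2, border=len(w) - 1)  # 5x5
dx = correlate_valid(padded, rot180(w))                        # 4x4
for row in dx:
    print(row)
# [10, 20, 20, 40]
# [30, 40, 60, 80]
# [30, 60, 40, 80]
# [90, 120, 120, 160]
```

The top-left entry is 10 = \partial y_{11} w_{11} and the entry in row 2, column 3 is 60 = \partial y_{12} w_{21}, agreeing with the element-wise expressions for \partial X.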

Gradient of the weights

Differentiating the expressions for Y with respect to W gives:
\partial w_{11} = \partial y_{11} x_{11} + \partial y_{12} x_{13} + \partial y_{21} x_{31} + \partial y_{22} x_{33}
\partial w_{12} = \partial y_{11} x_{12} + \partial y_{12} x_{14} + \partial y_{21} x_{32} + \partial y_{22} x_{34}
\partial w_{21} = \partial y_{11} x_{21} + \partial y_{12} x_{23} + \partial y_{21} x_{41} + \partial y_{22} x_{43}
\partial w_{22} = \partial y_{11} x_{22} + \partial y_{12} x_{24} + \partial y_{21} x_{42} + \partial y_{22} x_{44}

In matrix form:
\partial W = \left[ \begin{matrix} \partial w_{11} & \partial w_{12} \\ \partial w_{21} & \partial w_{22} \end{matrix} \right]=(dilation\_conv) \left[ \begin{matrix} x_{11} & x_{12} & x_{13} & x_{14} \\ x_{21} & x_{22} & x_{23} & x_{24} \\ x_{31} & x_{32} & x_{33} & x_{34} \\ x_{41} & x_{42} & x_{43} & x_{44} \end{matrix} \right] * \left[ \begin{matrix} \partial y_{11} & \partial y_{12} \\ \partial y_{21} & \partial y_{22} \end{matrix} \right]
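More compactly:
\partial W = dilation\_conv(X, \partial Y), \quad dilation = 2
So with a forward stride of 2, computing \partial W also differs from the stride-1 case: \partial Y acts as a kernel that slides over X with a dilation of 2 (equal to the forward stride), which is a dilated convolution rather than an ordinary one.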
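
A minimal sketch of this dilated operation in plain Python is given below. The function name `dilated_correlate` and the numeric values are illustrative assumptions; the kernel taps are spaced `dilation` entries apart, and the window itself moves in steps of 1.

```python
def dilated_correlate(x, k, dilation):
    """'Valid' sliding-window product with kernel taps spaced `dilation` apart."""
    kh, kw = len(k), len(k[0])
    span_h = (kh - 1) * dilation + 1   # effective kernel height
    span_w = (kw - 1) * dilation + 1   # effective kernel width
    out_h = len(x) - span_h + 1
    out_w = len(x[0]) - span_w + 1
    out = [[0] * out_w for _ in range(out_h)]
    for p in range(out_h):
        for q in range(out_w):
            for i in range(kh):
                for j in range(kw):
                    out[p][q] += k[i][j] * x[p + i * dilation][q + j * dilation]
    return out


# Illustrative values only (assumed, not from the article).
x = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
dy = [[10, 20],
      [30, 40]]

dw = dilated_correlate(x, dy, dilation=2)
print(dw)  # [[780, 880], [1180, 1280]]
```

The first entry is 10*1 + 20*3 + 30*9 + 40*11 = 780, which is the expression for \partial w_{11} with these values.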

Gradient of the bias

Differentiating the expressions for Y with respect to b gives:
\partial b = \partial y_{11} + \partial y_{12} + \partial y_{21} + \partial y_{22}

More compactly:
\partial b = \sum_{u,v} \partial y_{uv}
The bias gradient is computed exactly as in the stride-1 case.
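
As a quick sanity check, the bias gradient can be compared against a finite difference. The sketch below is in plain Python and assumes a surrogate cost L = \sum_{u,v} g_{uv} y_{uv}, chosen only so that the gradient of L with respect to Y is the given matrix g; the function names and numeric values are likewise illustrative. Because Y is linear in b, a unit step in b gives the exact slope.

```python
def conv2d_forward(x, w, b, stride):
    """'Valid' sliding-window product without kernel flipping."""
    kh, kw = len(w), len(w[0])
    out_h = (len(x) - kh) // stride + 1
    out_w = (len(x[0]) - kw) // stride + 1
    y = [[b] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            for p in range(kh):
                for q in range(kw):
                    y[i][j] += w[p][q] * x[i * stride + p][j * stride + q]
    return y


def surrogate_cost(y, g):
    """Assumed cost L = sum(g * y), so that dL/dY equals g."""
    return sum(g[i][j] * y[i][j]
               for i in range(len(y)) for j in range(len(y[0])))


# Illustrative values only (assumed, not from the article).
x = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
w = [[1, 2],
     [3, 4]]
b = 1
dy = [[10, 20],
      [30, 40]]

# Analytic gradient: sum over all entries of dY.
db = sum(sum(row) for row in dy)

# Finite difference with a unit step in b (exact here, since Y is linear in b).
db_numeric = (surrogate_cost(conv2d_forward(x, w, b + 1, 2), dy)
              - surrogate_cost(conv2d_forward(x, w, b, 2), dy))

print(db, db_numeric)  # 100 100
```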
