A worked example of backpropagation through a convolutional layer (stride=1)

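This section walks through a concrete example of backpropagation through a convolutional layer, deriving the gradients with respect to the input, the weights, and the bias. Assume the input feature map X is a 4x4 matrix, the kernel W is 2x2, stride=1, and the convolution is "valid" (no padding), so the output feature map Y is a 3x3 matrix. Here * denotes the sliding-window operation in which the kernel is not flipped (cross-correlation):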
\left[ \begin{matrix} x_{11} & x_{12} & x_{13} & x_{14} \\ x_{21} & x_{22} & x_{23} & x_{24} \\ x_{31} & x_{32} & x_{33} & x_{34} \\ x_{41} & x_{42} & x_{43} & x_{44} \end{matrix} \right]* \left[ \begin{matrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{matrix} \right]= \left[ \begin{matrix} y_{11} & y_{12} & y_{13} \\ y_{21} & y_{22} & y_{23} \\ y_{31} & y_{32} & y_{33} \end{matrix} \right]

The entries of the output Y, with a scalar bias b shared by all positions, are:
y_{11} = w_{11} x_{11} + w_{12} x_{12} + w_{21} x_{21} + w_{22} x_{22} + b
y_{12} = w_{11} x_{12} + w_{12} x_{13} + w_{21} x_{22} + w_{22} x_{23} + b
y_{13} = w_{11} x_{13} + w_{12} x_{14} + w_{21} x_{23} + w_{22} x_{24} + b
y_{21} = w_{11} x_{21} + w_{12} x_{22} + w_{21} x_{31} + w_{22} x_{32} + b
y_{22} = w_{11} x_{22} + w_{12} x_{23} + w_{21} x_{32} + w_{22} x_{33} + b
y_{23} = w_{11} x_{23} + w_{12} x_{24} + w_{21} x_{33} + w_{22} x_{34} + b
y_{31} = w_{11} x_{31} + w_{12} x_{32} + w_{21} x_{41} + w_{22} x_{42} + b
y_{32} = w_{11} x_{32} + w_{12} x_{33} + w_{21} x_{42} + w_{22} x_{43} + b
y_{33} = w_{11} x_{33} + w_{12} x_{34} + w_{21} x_{43} + w_{22} x_{44} + b
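
The nine expressions above can be checked numerically. The following is a minimal NumPy sketch, not a framework implementation: the helper name `conv2d_valid` and the concrete values of X, W, and b are assumptions chosen only for illustration.

```python
import numpy as np

def conv2d_valid(x, w, b=0.0):
    """Slide w over x without flipping it (stride 1, no padding) and add b."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    y = np.empty((oh, ow), dtype=float)
    for i in range(oh):
        for j in range(ow):
            # y[i, j] is the sum of the elementwise product of w with one window of x
            y[i, j] = np.sum(x[i:i + kh, j:j + kw] * w) + b
    return y

# Illustrative values (assumptions, not taken from the text)
x = np.arange(1.0, 17.0).reshape(4, 4)   # x_11 .. x_44 = 1 .. 16
w = np.array([[1.0, 2.0], [3.0, 4.0]])   # w_11, w_12, w_21, w_22
b = 0.5

y = conv2d_valid(x, w, b)

# y_11 written out exactly as in the first expression for Y
y11 = w[0, 0] * x[0, 0] + w[0, 1] * x[0, 1] + w[1, 0] * x[1, 0] + w[1, 1] * x[1, 1] + b
assert np.isclose(y[0, 0], y11)
print(y.shape)
```

The loop form is deliberately naive so that each output entry maps one-to-one onto the written-out expressions.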

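Suppose the gradient of the cost function with respect to the output Y is \partial Y, a 3x3 matrix, where each \partial y_{ij} is shorthand for the partial derivative of the cost with respect to y_{ij}: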
\left[ \begin{matrix} \partial y_{11} & \partial y_{12} & \partial y_{13}\\ \partial y_{21} & \partial y_{22} & \partial y_{23}\\ \partial y_{31} & \partial y_{32} & \partial y_{33} \end{matrix} \right]

We now derive the gradients of the input, the weights, and the bias in turn.

Gradient of the input

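Applying the chain rule to the expressions for Y, each \partial x_{ij} sums the contributions of every output entry that uses x_{ij}: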
\partial x_{11} = \partial y_{11} w_{11}
\partial x_{12} = \partial y_{11} w_{12} + \partial y_{12} w_{11}
\partial x_{13} = \partial y_{12} w_{12} + \partial y_{13} w_{11}
\partial x_{14} = \partial y_{13} w_{12}
\partial x_{21} = \partial y_{11} w_{21} + \partial y_{21} w_{11}
\partial x_{22} = \partial y_{11} w_{22} + \partial y_{12} w_{21} + \partial y_{21} w_{12} + \partial y_{22} w_{11}
\partial x_{23} = \partial y_{12} w_{22} + \partial y_{13} w_{21} + \partial y_{22} w_{12} + \partial y_{23} w_{11}
\partial x_{24} = \partial y_{13} w_{22} + \partial y_{23} w_{12}
\partial x_{31} = \partial y_{21} w_{21} + \partial y_{31} w_{11}
\partial x_{32} = \partial y_{21} w_{22} + \partial y_{22} w_{21} + \partial y_{31} w_{12} + \partial y_{32} w_{11}
\partial x_{33} = \partial y_{22} w_{22} + \partial y_{23} w_{21} + \partial y_{32} w_{12} + \partial y_{33} w_{11}
\partial x_{34} = \partial y_{23} w_{22} + \partial y_{33} w_{12}
\partial x_{41} = \partial y_{31} w_{21}
\partial x_{42} = \partial y_{31} w_{22} + \partial y_{32} w_{21}
\partial x_{43} = \partial y_{32} w_{22} + \partial y_{33} w_{21}
\partial x_{44} = \partial y_{33} w_{22}

In matrix form:
\partial X = \left[ \begin{matrix} \partial x_{11} & \partial x_{12} & \partial x_{13} & \partial x_{14}\\ \partial x_{21} & \partial x_{22} & \partial x_{23} & \partial x_{24}\\ \partial x_{31} & \partial x_{32} & \partial x_{33} & \partial x_{34}\\ \partial x_{41} & \partial x_{42} & \partial x_{43} & \partial x_{44} \end{matrix} \right]= \left[ \begin{matrix} 0 & 0 & 0 & 0 & 0 \\ 0 & \partial y_{11} & \partial y_{12} & \partial y_{13} & 0\\ 0 & \partial y_{21} & \partial y_{22} & \partial y_{23} & 0\\ 0 & \partial y_{31} & \partial y_{32} & \partial y_{33} & 0\\ 0 & 0 & 0 & 0 & 0 \end{matrix} \right]* \left[ \begin{matrix} w_{22} & w_{21} \\ w_{12} & w_{11} \end{matrix} \right]
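More compactly, with \partial Y zero-padded by one element on every side and W rotated by 180 degrees:
\partial X = \mathrm{pad}(\partial Y) * \mathrm{rot180}(W)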
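
As a sanity check on the padded, rotated form, the sketch below computes the input gradient in two ways and compares them. It assumes NumPy; `conv2d_valid` is a hypothetical helper for the sliding-window operation, and the random values of W and \partial Y are illustrative only.

```python
import numpy as np

def conv2d_valid(x, w):
    """Slide w over x without flipping it (stride 1, no padding)."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow), dtype=float)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

rng = np.random.default_rng(0)
w = rng.standard_normal((2, 2))    # kernel W (illustrative values)
dy = rng.standard_normal((3, 3))   # upstream gradient dY (illustrative values)

# Matrix form: zero-pad dY by (kernel size - 1) = 1 on every side,
# then slide the 180-degree-rotated kernel over it.
dy_padded = np.pad(dy, 1)               # 5x5, zeros on the border
dx = conv2d_valid(dy_padded, np.rot90(w, 2))

# Reference: every y_ij was built from the 2x2 window of X starting at (i, j),
# so its gradient flows back to that window, weighted by W.
dx_ref = np.zeros((4, 4))
for i in range(3):
    for j in range(3):
        dx_ref[i:i + 2, j:j + 2] += dy[i, j] * w

assert np.allclose(dx, dx_ref)
```

The reference loop reproduces the sixteen element-wise formulas directly, so agreement between `dx` and `dx_ref` confirms that the compact form is equivalent for this 4x4 / 2x2 case.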

Gradient of the weights

Differentiating the expressions for Y with respect to W gives:
\partial w_{11} = \partial y_{11} x_{11} + \partial y_{12} x_{12} + \partial y_{13} x_{13} + \partial y_{21} x_{21} + \partial y_{22} x_{22} + \partial y_{23} x_{23} + \partial y_{31} x_{31} + \partial y_{32} x_{32} + \partial y_{33} x_{33}
\partial w_{12} = \partial y_{11} x_{12} + \partial y_{12} x_{13} + \partial y_{13} x_{14} + \partial y_{21} x_{22} + \partial y_{22} x_{23} + \partial y_{23} x_{24} + \partial y_{31} x_{32} + \partial y_{32} x_{33} + \partial y_{33} x_{34}
\partial w_{21} = \partial y_{11} x_{21} + \partial y_{12} x_{22} + \partial y_{13} x_{23} + \partial y_{21} x_{31} + \partial y_{22} x_{32} + \partial y_{23} x_{33} + \partial y_{31} x_{41} + \partial y_{32} x_{42} + \partial y_{33} x_{43}
\partial w_{22} = \partial y_{11} x_{22} + \partial y_{12} x_{23} + \partial y_{13} x_{24} + \partial y_{21} x_{32} + \partial y_{22} x_{33} + \partial y_{23} x_{34} + \partial y_{31} x_{42} + \partial y_{32} x_{43} + \partial y_{33} x_{44}

In matrix form:
\partial W = \left[ \begin{matrix} x_{11} & x_{12} & x_{13} & x_{14} \\ x_{21} & x_{22} & x_{23} & x_{24} \\ x_{31} & x_{32} & x_{33} & x_{34} \\ x_{41} & x_{42} & x_{43} & x_{44} \end{matrix} \right]* \left[ \begin{matrix} \partial y_{11} & \partial y_{12} & \partial y_{13} \\ \partial y_{21} & \partial y_{22} & \partial y_{23} \\ \partial y_{31} & \partial y_{32} & \partial y_{33} \end{matrix} \right]
More compactly:
\partial W = X * \partial Y
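
The weight gradient can be checked against a finite-difference estimate. The sketch below assumes NumPy and a surrogate cost L = sum(\partial Y ⊙ Y), chosen only because its gradient with respect to Y is exactly the given \partial Y; `conv2d_valid` is a hypothetical helper and all values are illustrative.

```python
import numpy as np

def conv2d_valid(x, w):
    """Slide w over x without flipping it (stride 1, no padding)."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow), dtype=float)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4))    # input X (illustrative values)
w = rng.standard_normal((2, 2))    # kernel W (illustrative values)
dy = rng.standard_normal((3, 3))   # upstream gradient dY (illustrative values)

# Compact form: slide dY over X; the result has the kernel's 2x2 shape.
dw = conv2d_valid(x, dy)

# Surrogate cost whose gradient with respect to Y is exactly dy (an assumption
# made for checking; the bias is omitted because it does not affect dW).
def loss(w_):
    return np.sum(conv2d_valid(x, w_) * dy)

# Central finite differences, one kernel entry at a time.
eps = 1e-6
dw_num = np.zeros_like(w)
for a in range(2):
    for c in range(2):
        e = np.zeros_like(w)
        e[a, c] = eps
        dw_num[a, c] = (loss(w + e) - loss(w - e)) / (2 * eps)

assert np.allclose(dw, dw_num, atol=1e-6)
```

Because the surrogate cost is linear in W, the central difference agrees with the analytic gradient up to floating-point rounding.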

Gradient of the bias

Differentiating the expressions for Y with respect to b gives:
\partial b = \partial y_{11} + \partial y_{12} + \partial y_{13} + \partial y_{21} + \partial y_{22} + \partial y_{23} + \partial y_{31} + \partial y_{32} + \partial y_{33}

More compactly, summing over all output positions (u, v):
\partial b = \sum_{u,v} \partial y_{uv}
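
The bias gradient is a plain sum, which the short sketch below compares with a finite-difference estimate. As before, it assumes NumPy and a surrogate cost L = sum(\partial Y ⊙ Y); `conv2d_valid` is a hypothetical helper and the values are illustrative.

```python
import numpy as np

def conv2d_valid(x, w, b=0.0):
    """Slide w over x without flipping it (stride 1, no padding) and add b."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow), dtype=float)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w) + b
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4))    # input X (illustrative values)
w = rng.standard_normal((2, 2))    # kernel W (illustrative values)
dy = rng.standard_normal((3, 3))   # upstream gradient dY (illustrative values)

# b is added to every output entry, so its gradient is the sum of all dY entries.
db = dy.sum()

# Surrogate cost whose gradient with respect to Y is exactly dy (an assumption).
def loss(b_):
    return np.sum(conv2d_valid(x, w, b_) * dy)

eps = 1e-6
db_num = (loss(eps) - loss(-eps)) / (2 * eps)
assert np.isclose(db, db_num, atol=1e-6)
```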
