Mutual information = fairness + firmness

Let $\mathbf{x}$ denote the input data, $c$ the class label, and $\mathbf{y} = p(c \mid \mathbf{x})$ the model output. The mutual information between an input and its class is:
$\begin{aligned} \mathcal { I } ( c ; \mathbf { x } ) & = \iint d c d \mathbf { x } p ( c , \mathbf { x } ) \log \frac { p ( c , \mathbf { x } ) } { p ( c ) p ( \mathbf { x } ) } \\ & = \int d \mathbf { x } p ( \mathbf { x } ) \int d c p ( c \mid \mathbf { x } ) \log \frac { p ( c \mid \mathbf { x } ) } { p ( c ) } \\ & = \int d \mathbf { x } p ( \mathbf { x } ) \int d c p ( c \mid \mathbf { x } ) \log \frac { p ( c \mid \mathbf { x } ) } { \int d \mathbf { x } p ( \mathbf { x } ) p ( c \mid \mathbf { x } ) } \end{aligned}$
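The terms in the expression above are interpreted as follows:

$\int d\mathbf{x}\, p(\mathbf{x}) (\cdot)$ is the expectation over inputs; on the training set it is approximated by the sample mean $\frac{1}{N_{ts}} \sum_{ts} (\cdot)$, where $N_{ts}$ is the number of training samples;
$p(c \mid \mathbf{x})$ is the model output, and $\int dc\, p(c \mid \mathbf{x}) (\cdot)$ is the expectation over classes under that output; with $N_c$ discrete classes it becomes the weighted sum $\sum_{i=1}^{N_c} y_i (\cdot)$, where $y_i$ is the output for class $i$;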

During training the expression can therefore be written as follows, where $\overline{y_i} = \frac{1}{N_{ts}} \sum_{ts} y_i$ is the mean output for class $i$ over the training set (the empirical counterpart of $p(c)$):
$\begin{aligned} \mathcal{I}(c; \mathbf{x}) &= \frac{1}{N_{ts}} \sum_{ts} \sum_{i=1}^{N_c} y_i \log \frac{y_i}{\overline{y_i}} \\ &= -\sum_{i=1}^{N_c} \overline{y_i} \log \overline{y_i} + \frac{1}{N_{ts}} \sum_{ts} \sum_{i=1}^{N_c} y_i \log y_i \\ &= \mathcal{H}(\overline{\mathbf{y}}) - \overline{\mathcal{H}(\mathbf{y})} \end{aligned}$

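$\mathcal{H}(\overline{\mathbf{y}})$ is the entropy of the mean output distribution; maximizing it reflects the classifier's "evenness" or "fairness", meaning that on average every class is used equally often;
$\overline{\mathcal{H}(\mathbf{y})}$ is the mean of the per-sample output entropies; minimizing it reflects the classifier's "decisiveness" or "firmness", meaning that each individual prediction is confident.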
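
The decomposition into an entropy of the mean minus a mean of entropies can be checked numerically. The following is a minimal sketch in plain Python, assuming the model outputs are given as a list of probability vectors; the function names `entropy` and `mutual_information` are hypothetical helpers introduced only for this illustration, and natural logarithms are assumed.

```python
import math


def entropy(p):
    """Shannon entropy in nats of a discrete distribution (0 * log 0 treated as 0)."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def mutual_information(outputs):
    """Estimate I(c; x) as H(mean of y) minus mean of H(y).

    `outputs` is a list of N_ts probability vectors, each of length N_c.
    Returns (mutual information, entropy of the mean, mean entropy).
    """
    n_samples = len(outputs)
    n_classes = len(outputs[0])
    # Mean output per class, i.e. \bar{y}_i in the text.
    y_bar = [sum(y[i] for y in outputs) / n_samples for i in range(n_classes)]
    entropy_of_mean = entropy(y_bar)                           # "fairness" term
    mean_entropy = sum(entropy(y) for y in outputs) / n_samples  # low value = "firm"
    return entropy_of_mean - mean_entropy, entropy_of_mean, mean_entropy


# Fair and firm: each sample is assigned confidently, classes are balanced.
mi_good, _, _ = mutual_information([[1.0, 0.0], [0.0, 1.0]])

# Fair but not firm: every prediction is maximally uncertain.
mi_unsure, _, _ = mutual_information([[0.5, 0.5], [0.5, 0.5]])

# Firm but not fair: every sample collapses onto the same class.
mi_collapsed, _, _ = mutual_information([[1.0, 0.0], [1.0, 0.0]])

print(mi_good, mi_unsure, mi_collapsed)
```

In this toy two-class setting only the first case reaches the maximum value $\log 2$; the other two give zero, which illustrates why both terms are needed: a classifier must be fair and firm at the same time to carry information about the class.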

Reference:
Bridle J, Heading A, MacKay D. Unsupervised classifiers, mutual information and 'phantom targets'[J]. Advances in Neural Information Processing Systems, 1991, 4.
