2.3.8 Periodic Variables

For some continuous variables, periodic variables among them, a Gaussian distribution is not a suitable model.

Why is it unsuitable?

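Consider the problem of finding the mean of a set of observations $D=\{\theta_1,\theta_2,\dots,\theta_N\}$, where each $\theta_n$ is measured in radians. The arithmetic mean $\frac{\theta_1+\theta_2+\dots+\theta_N}{N}$ depends strongly on the choice of coordinate system (origin). For example, observations at $1^\circ$ and $359^\circ$ lie close together on the circle, yet their arithmetic mean is $180^\circ$; if the origin is moved so that they read $-1^\circ$ and $1^\circ$, it becomes $0^\circ$. To find a measure of the mean that is invariant, we treat the observations as points on the unit circle, so that each one is described by a two-dimensional unit vector $x_1,x_2,\dots,x_N$ (each component lies between -1 and 1), and we average the vectors instead:
$\hat x = \frac{1}{N}\sum^N_{n=1}x_n$
We then take the angle $\hat \theta$ of this mean vector $\hat x$ as the mean. This definition makes the location of the mean independent of the origin of the angular coordinate: shifting every $\theta_n$ by a constant rotates every $x_n$, and therefore $\hat x$, by the same angle. Note that $\hat x$ typically lies inside the unit circle. In Cartesian coordinates $x_n=(\cos\theta_n, \sin\theta_n)$, and the sample mean can be written as $\hat x = (\hat r \cos\hat \theta, \hat r \sin\hat \theta)$. Substituting into the mean formula above gives:
$\hat x_1 = \hat r \cos\hat \theta=\frac{1}{N}\sum^N_{n=1}\cos\theta_n, \quad \hat x_2 = \hat r \sin\hat \theta=\frac{1}{N}\sum^N_{n=1}\sin\theta_n$
Taking the ratio of the two and using $\tan\theta = \frac{\sin\theta}{\cos\theta}$, we can solve for $\hat \theta$, which is the mean of the observations $D=\{\theta_1,\theta_2,\dots,\theta_N\}$:
$\hat \theta = \tan^{-1}\{\frac{\sum_n \sin\theta_n}{\sum_n \cos\theta_n}\}$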
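The contrast between the arithmetic mean and the vector-based mean can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `circular_mean` and the sample angles are illustrative assumptions, not part of the original notes. It uses `math.atan2` rather than a plain arctangent so that the angle lands in the correct quadrant.

```python
import math

def circular_mean(thetas):
    """Return (theta_hat, r_hat) for a list of angles in radians.

    theta_hat is the angle of the averaged unit vectors and
    r_hat is the length of that averaged vector (at most 1).
    """
    n = len(thetas)
    x1 = sum(math.cos(t) for t in thetas) / n  # mean of the cosine components
    x2 = sum(math.sin(t) for t in thetas) / n  # mean of the sine components
    return math.atan2(x2, x1), math.hypot(x1, x2)

# Two observations on either side of the origin: 1 degree and 359 degrees.
angles = [math.radians(1.0), math.radians(359.0)]

naive = sum(angles) / len(angles)          # arithmetic mean of the raw angles
theta_hat, r_hat = circular_mean(angles)   # mean direction on the circle

# Move the origin by a constant offset (hypothetical value 0.7 rad):
# the circular mean moves by exactly the same offset.
shift = 0.7
shifted_hat, _ = circular_mean([t + shift for t in angles])

print(math.degrees(naive), math.degrees(theta_hat), shifted_hat - theta_hat)
```

The arithmetic mean points at the opposite side of the circle from the data, while the circular mean stays between the two observations and simply follows the origin when it is shifted.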

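We now consider a generalization of the Gaussian to periodic variables: the von Mises distribution.
By convention we consider distributions $p(\theta)$ with period $2\pi$, which must satisfy the following three conditions:
$p(\theta)\geq 0\\ \int^{2\pi}_0p(\theta)d\theta = 1\\ p(\theta+2\pi)=p(\theta)$
A Gaussian-like distribution satisfying these three properties is easy to obtain. Consider a Gaussian distribution over two variables $x=(x_1,x_2)$ with mean $\mu=(\mu_1,\mu_2)$ and covariance matrix $\Sigma=\sigma^2 I$, where $I$ is the $2\times 2$ identity matrix, so that
$p(x_1,x_2)=\frac{1}{2\pi\sigma^2}\exp\{-\frac{(x_1-\mu_1)^2+(x_2-\mu_2)^2}{2\sigma^2}\}$
The contours of constant $p(x)$ are circles, as shown in the figure.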

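Now consider the value of this distribution along a circle of fixed radius. By construction, this value is periodic in the angle, although it is not normalized. Its form can be found by transforming from Cartesian coordinates to polar coordinates: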

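$x_1 = r \cos\theta, \quad x_2 = r \sin\theta\\ \mu_1 = r_0 \cos\theta_0, \quad \mu_2 = r_0 \sin\theta_0$

Substituting these into the Gaussian and restricting to the unit circle $r=1$, we examine the exponent (the only part that depends on $\theta$):

$-\frac{1}{2\sigma^2}\{(\cos\theta-r_0\cos\theta_0)^2+(\sin\theta-r_0\sin\theta_0)^2\}\\ =-\frac{1}{2\sigma^2}\{1+r_0^2-2r_0\cos\theta \cos\theta_0-2r_0\sin\theta \sin\theta_0\}\\ =\frac{r_0}{\sigma^2}\cos(\theta-\theta_0)+C$

where $C$ collects the terms that do not depend on $\theta$. Defining $m = \frac{r_0}{\sigma^2}$, we obtain the expression for the distribution $p(\theta)$ along the unit circle $r=1$:

$p(\theta|\theta_0,m)=\frac{1}{2\pi I_0(m)}\exp\{m\cos(\theta-\theta_0)\}$

This is called the von Mises distribution, or the circular normal. Here $\theta_0$ corresponds to the mean of the distribution, and $m$ is known as the concentration parameter, analogous to the precision (inverse variance) of a Gaussian. The normalization coefficient involves $I_0(m)$, the zeroth-order modified Bessel function of the first kind:

$I_0(m) = \frac{1}{2\pi}\int^{2\pi}_0\exp\{m\cos\theta\}d\theta$

For large $m$, the distribution becomes approximately Gaussian.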
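The density and its normalization can be sanity-checked numerically. The sketch below is standard-library Python under a few assumptions of its own: the helper names (`bessel_i0`, `von_mises_pdf`, `integrate_over_circle`) are hypothetical, $I_0(m)$ is evaluated from its power series $\sum_k (m^2/4)^k/(k!)^2$ rather than from the integral, and the parameter values are arbitrary. It checks that the density integrates to one and that, for a large concentration, its peak is close to the peak of a Gaussian with variance $1/m$.

```python
import math

def bessel_i0(m, terms=100):
    """Zeroth-order modified Bessel function of the first kind via its power series."""
    total, term = 0.0, 1.0              # term holds (m^2/4)^k / (k!)^2, starting at k = 0
    for k in range(terms):
        total += term
        term *= (m * m / 4.0) / ((k + 1) ** 2)
    return total

def von_mises_pdf(theta, theta0, m):
    """p(theta | theta0, m) = exp(m cos(theta - theta0)) / (2 pi I_0(m))."""
    return math.exp(m * math.cos(theta - theta0)) / (2.0 * math.pi * bessel_i0(m))

def integrate_over_circle(f, steps=2000):
    """Equally spaced Riemann sum of f over one period [0, 2 pi)."""
    h = 2.0 * math.pi / steps
    return h * sum(f(i * h) for i in range(steps))

# Normalization check for illustrative parameters theta0 = 1.0, m = 5.0.
total = integrate_over_circle(lambda t: von_mises_pdf(t, 1.0, 5.0))

# Large-m check: compare the peak height with a Gaussian of variance 1/m.
m_large = 50.0
peak_ratio = von_mises_pdf(0.0, 0.0, m_large) / math.sqrt(m_large / (2.0 * math.pi))

print(total, peak_ratio)
```

The power series is adequate for moderate $m$; for very large concentrations a scaled Bessel routine from a numerical library would be a safer choice.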

Now consider maximum likelihood estimation of the parameters $m$ and $\theta_0$ of the von Mises distribution. The log-likelihood function is:
$\ln p(D|\theta_0, m) = -N\ln(2\pi)-N\ln I_0(m)+m\sum^N_{n=1}\cos(\theta_n-\theta_0)$

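Solving for $\theta_0$
Setting the derivative with respect to $\theta_0$ to zero gives:
$\sum^N_{n=1}\sin(\theta_n-\theta_0)=0$
Using the trigonometric identity
$\sin(A-B)=\cos B\sin A-\cos A\sin B$
we obtain
$\theta_0^{ML}=\tan^{-1}\{\frac{\sum_n \sin\theta_n}{\sum_n \cos\theta_n}\}$
This is the same as the mean $\hat \theta = \tan^{-1}\{\frac{\sum_n \sin\theta_n}{\sum_n \cos\theta_n}\}$ obtained earlier by viewing the observations in the two-dimensional Cartesian space.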
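The stationarity condition can be verified on a small example. The following sketch, in standard-library Python, evaluates the log-likelihood and its derivative with respect to $\theta_0$ at the estimate. The data values, the fixed concentration `m = 2.0`, and the function names are assumptions made for illustration. Note that `math.atan2` is used instead of a plain arctangent: when $\sum_n \cos\theta_n$ is negative, the plain arctangent would return the stationary point that minimizes the likelihood rather than the one that maximizes it.

```python
import math

def bessel_i0(m, terms=100):
    """Zeroth-order modified Bessel function of the first kind via its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (m * m / 4.0) / ((k + 1) ** 2)
    return total

def log_likelihood(thetas, theta0, m):
    """ln p(D | theta0, m) for the von Mises distribution."""
    n = len(thetas)
    return (-n * math.log(2.0 * math.pi)
            - n * math.log(bessel_i0(m))
            + m * sum(math.cos(t - theta0) for t in thetas))

def theta0_ml(thetas):
    """Maximum likelihood mean direction, with the quadrant resolved by atan2."""
    s = sum(math.sin(t) for t in thetas)
    c = sum(math.cos(t) for t in thetas)
    return math.atan2(s, c)

thetas = [0.5, 1.0, 1.5, 2.5, 6.0]   # illustrative observations in radians
m = 2.0                              # concentration held fixed for this check

best = theta0_ml(thetas)
# Derivative of the log-likelihood with respect to theta0, evaluated at the estimate.
gradient = m * sum(math.sin(t - best) for t in thetas)

print(best, gradient, log_likelihood(thetas, best, m))
```

The derivative vanishes at the estimate up to rounding error, and moving $\theta_0$ away from it in either direction lowers the log-likelihood.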
Solving for $m$
I did not follow this part, so I am just pasting the figure.
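For reference, a sketch of how this step is usually handled (it is not taken from the missing figure): setting the derivative of the log-likelihood with respect to $m$ to zero and using $I_0'(m)=I_1(m)$ gives the condition $A(m^{ML})=\frac{1}{N}\sum^N_{n=1}\cos(\theta_n-\theta_0^{ML})$, where $A(m)=\frac{I_1(m)}{I_0(m)}$ and the right-hand side equals the length $\hat r$ of the mean vector. This equation has no closed-form solution, so it is solved numerically. The Python below is one possible approach under stated assumptions: the function names are hypothetical, the Bessel functions come from their power series, and the root is found by bisection on the bracket $[0, 100]$, which only covers $\hat r$ up to roughly 0.995.

```python
import math

def bessel_i(order, m, terms=200):
    """Modified Bessel function of the first kind for order 0 or 1, via its power series."""
    term = (m / 2.0) ** order           # k = 0 term of sum_k (m/2)^(2k+order) / (k! (k+order)!)
    total = 0.0
    for k in range(terms):
        total += term
        term *= (m * m / 4.0) / ((k + 1) * (k + 1 + order))
    return total

def bessel_ratio(m):
    """A(m) = I_1(m) / I_0(m), which increases from 0 towards 1 as m grows."""
    return bessel_i(1, m) / bessel_i(0, m)

def mean_resultant(thetas):
    """Return (theta_hat, r_hat): direction and length of the averaged unit vectors."""
    n = len(thetas)
    c = sum(math.cos(t) for t in thetas) / n
    s = sum(math.sin(t) for t in thetas) / n
    return math.atan2(s, c), math.hypot(c, s)

def fit_concentration(r_hat, lo=0.0, hi=100.0, iters=100):
    """Solve A(m) = r_hat by bisection; assumes the root lies inside [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if bessel_ratio(mid) < r_hat:
            lo = mid                    # A(mid) too small, so the root is to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

thetas = [0.5, 1.0, 1.5, 2.5, 6.0]      # illustrative observations in radians
theta_hat, r_hat = mean_resultant(thetas)
m_ml = fit_concentration(r_hat)

print(theta_hat, r_hat, m_ml)
```

Bisection is chosen here only because $A(m)$ is monotonic, which makes the bracket safe; a Newton iteration or a library root finder would work equally well.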
