Logistic Regression

1. Nature: a linear classifier (classification is made by comparing y = \theta ^Tx against a predetermined threshold)

2. Model: \hat{y} = g(\theta ^Tx) = h_{\theta}(x) = \frac{1}{1+e^{-\theta ^T x}}
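A minimal NumPy sketch of this model (the names `sigmoid` and `h_theta` are illustrative, not from the original notes):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def h_theta(theta, x):
    """Hypothesis h_theta(x) = g(theta^T x); theta and x are 1-D arrays."""
    return sigmoid(np.dot(theta, x))
```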

3. Derivative: let z = \theta ^Tx; then g'(z) = \frac{d}{dz} \frac{1}{1+e^{-z}} = \frac{1}{(1+e^{-z})^2} \cdot e^{-z} = g(z) \cdot (1-g(z))
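The identity g'(z) = g(z)(1 - g(z)) can be checked numerically with a central finite difference (a quick sanity-check sketch, not part of the original notes):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Compare the analytic derivative g(z)*(1-g(z)) with a central difference.
z = np.linspace(-5.0, 5.0, 11)
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
analytic = sigmoid(z) * (1 - sigmoid(z))
print(np.allclose(numeric, analytic, atol=1e-6))  # True
```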

4. Loss function: writing P(y=1 | x; \theta) = h_\theta (x) and P(y=0 | x; \theta) = 1 - h_\theta (x), the two cases can be combined as P(y | x; \theta) = h_\theta (x)^y(1 - h_\theta (x))^{1-y}

To estimate the parameters from a set of data by maximum likelihood, the likelihood is L(\theta ) = P(\vec{y}  | X; \theta ) = \prod\nolimits_{i=1}^n P(y^{(i)} | x^{(i)}; \theta ) = \prod\nolimits_{i=1}^n h_\theta (x^{(i)})^{y^{(i)}}  (1-h_\theta (x^{(i)}))^{1-y^{(i)}}

Taking the logarithm gives

l(\theta ) = \log L(\theta ) = \sum\nolimits_{i=1}^n y^{(i)} \log h_\theta (x^{(i)}) + (1-y^{(i)}) \log (1-h_\theta (x^{(i)}))
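The log-likelihood above translates directly into code (a sketch assuming an (n, d) design matrix `X` and 0/1 labels `y`; the function name is my own):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(theta, X, y):
    """l(theta) = sum_i [ y_i * log h(x_i) + (1 - y_i) * log(1 - h(x_i)) ]."""
    h = sigmoid(X @ theta)
    return np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))
```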

5. Algorithm: gradient ascent (since the goal is to maximize the likelihood)

First, take the partial derivative (using g'(z) = g(z)(1-g(z)) from step 3),

\frac{\partial}{\partial \theta _j} l(\theta) = (y\frac{1}{g(\theta ^Tx)} - (1-y)\frac {1}{1-g(\theta ^Tx)}) \frac{\partial}{\partial \theta _j}g(\theta ^Tx) = (y-h_\theta (x))x_j

Then update the parameters with stochastic gradient ascent (SGA),

\theta _j := \theta _j + \alpha (y^{(i)}-h_\theta (x^{(i)})) x^{(i)}_j

b := b + \alpha (y^{(i)} - h_\theta(x^{(i)}))
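A small sketch of the SGA loop implementing exactly these two updates (the function name, learning rate, and epoch count are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sga_fit(X, y, alpha=0.1, epochs=100, seed=0):
    """Fit theta and b by stochastic gradient ascent on the log-likelihood.

    Per-sample update:
        theta_j += alpha * (y_i - h(x_i)) * x_ij
        b       += alpha * (y_i - h(x_i))
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):          # visit samples in random order
            err = y[i] - sigmoid(X[i] @ theta + b)
            theta += alpha * err * X[i]
            b += alpha * err
    return theta, b

# Toy usage: two well-separated 1-D clusters.
X = np.array([[-2.0], [-1.5], [1.5], [2.0]])
y = np.array([0, 0, 1, 1])
theta, b = sga_fit(X, y)
print(sigmoid(X @ theta + b).round(2))  # predictions close to [0, 0, 1, 1]
```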

(2019/04/06 21:30)
