Bayesian Decision Theory


Bayes's Theorem

P(A|B) = \frac{P(B|A)P(A)}{P(B)}
prior: P(\omega)
likelihood: P(x|\omega)
posterior: P(\omega_i|x) = \frac{P(x|\omega_i)P(\omega_i)}{P(x)} = \frac{P(x|\omega_i)P(\omega_i)}{\sum_{j=1}^k P(x|\omega_j)P(\omega_j)}
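
A minimal numeric sketch of the posterior computation above, with made-up priors and likelihood values at a single observation x (all numbers are illustrative):

```python
# Posterior P(omega_i | x) from assumed priors and class-conditional likelihoods.
# The numbers below are made up for illustration.
import numpy as np

priors = np.array([0.6, 0.4])        # P(omega_1), P(omega_2)
likelihoods = np.array([0.3, 0.7])   # P(x | omega_1), P(x | omega_2) at this x

evidence = np.sum(likelihoods * priors)        # P(x) = sum_j P(x|omega_j) P(omega_j)
posteriors = likelihoods * priors / evidence   # P(omega_i | x)
print(posteriors)                              # [0.391... 0.608...], sums to 1
```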

Optimal Bayes Decision Rule: minimize the probability of error.
    if P(\omega_1|x) > P(\omega_2|x), decide \omega_1;
    if P(\omega_1|x) < P(\omega_2|x), decide \omega_2.

Proof: For a particular x,
        P(error|x) = P(\omega_1|x) if we decide \omega_2;
        P(error|x) = P(\omega_2|x) if we decide \omega_1.
Bayes Decision Rule: decide \omega_1 if P(\omega_1|x) > P(\omega_2|x); otherwise decide \omega_2.
Therefore: P(error|x) = \min[P(\omega_1|x), P(\omega_2|x)].
The unconditional error is obtained by averaging P(error|x) over all x: P(error) = \int P(error|x)\,p(x)\,dx.
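
To make the rule and the resulting error concrete, here is a rough sketch assuming two equally likely classes with 1-D Gaussian class-conditional densities (the densities and parameters are assumptions, not part of the derivation above):

```python
# Minimum-error Bayes rule for two classes with assumed Gaussian class-conditionals.
import numpy as np
from scipy.stats import norm

priors = np.array([0.5, 0.5])              # P(omega_1), P(omega_2)
cond = [norm(0.0, 1.0), norm(2.0, 1.0)]    # p(x | omega_1), p(x | omega_2)

xs = np.linspace(-8.0, 10.0, 2001)
lik = np.vstack([d.pdf(xs) for d in cond])   # class-conditional densities on a grid
joint = lik * priors[:, None]                # p(x | omega_i) P(omega_i)
px = joint.sum(axis=0)                       # evidence p(x)
post = joint / px                            # posteriors P(omega_i | x)

decisions = post.argmax(axis=0)              # decide the class with the larger posterior
p_error_given_x = post.min(axis=0)           # P(error | x) = min posterior
p_error = np.trapz(p_error_given_x * px, xs) # integrate P(error|x) p(x) over x
print(p_error)                               # about 0.159 for these parameters
```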

Bayesian Decision Theory

c possible states of nature: \{\omega_1, \omega_2, \cdots, \omega_c\}
a possible actions: \{\alpha_1, \alpha_2, \cdots, \alpha_a\}
the loss for taking action \alpha_i when the true state of nature is \omega_j: \lambda(\alpha_i|\omega_j)
R(\alpha_i|x) = \sum_{j=1}^{c}\lambda(\alpha_i|\omega_j)P(\omega_j|x)
Select the action for which the conditional risk R(\alpha_i|x) is minimum.
Overall (Bayes) Risk: R = \int R(\alpha(x)|x)\,p(x)\,dx, where \alpha(x) is the action taken for x; choosing the minimum-conditional-risk action at every x minimizes the overall risk.
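
A small sketch of the conditional-risk rule, assuming a hypothetical 2x2 loss matrix and posteriors at some fixed x:

```python
# Conditional risk R(alpha_i | x) = sum_j lambda(alpha_i | omega_j) P(omega_j | x),
# then take the action with minimum conditional risk (the numbers are made up).
import numpy as np

# loss[i, j] = lambda(alpha_i | omega_j): cost of action alpha_i when truth is omega_j
loss = np.array([[0.0, 2.0],
                 [1.0, 0.0]])
posteriors = np.array([0.35, 0.65])   # P(omega_1 | x), P(omega_2 | x)

cond_risk = loss @ posteriors         # R(alpha_1 | x), R(alpha_2 | x)
best_action = np.argmin(cond_risk)    # Bayes rule: minimum-risk action
print(cond_risk, best_action)         # [1.3  0.35] -> take alpha_2
```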

  • Example 1:
    action \alpha_1: deciding \omega_1
    action \alpha_2: deciding \omega_2
    \lambda_{ij} = \lambda(\alpha_i|\omega_j)
    R(\alpha_1|x) = \lambda_{11}P(\omega_1|x) + \lambda_{12}P(\omega_2|x)
    R(\alpha_2|x) = \lambda_{21}P(\omega_1|x) + \lambda_{22}P(\omega_2|x)
    if R(\alpha_1|x) < R(\alpha_2|x), action \alpha_1 is taken: decide \omega_1.
  • Example 2:
    Suppose \lambda\left(\alpha_{i} | \omega_{j}\right)=\left\{\begin{array}{ll}{0} & {i=j} \\ {1} & {i \neq j}\end{array}\right.
    Conditional risk
    R\left(\alpha_{i} | x\right)=\sum_{j=1}^{c} \lambda\left(\alpha_{i} | \omega_{j}\right) P\left(\omega_{j} | x\right) =\sum_{j \neq i} P\left(\omega_{j} | x\right)=1-P\left(\omega_{i} | x\right)
    Minimizing the risk \longrightarrow Maximizing the posterior P(\omega_i|x).
    So we have the discriminant functions (the maximum discriminant corresponds to the minimum risk; a numeric sketch follows this list):
    g_{i}(x) = -R\left(\alpha_{i} | x\right)
    \Longleftrightarrow g_{i}(x) = P\left(\omega_{i} | x\right)
    \Longleftrightarrow g_{i}(x) = P(x | \omega_{i}) P\left(\omega_{i}\right)
    \Longleftrightarrow g_{i}(x) = \ln P(x | \omega_{i}) + \ln P\left(\omega_{i}\right)
    Set of discriminant functions: g_{i}(x), i=1, \cdots, c
    Classifier assigns a feature vector x to class \omega_i if: g_{i}(x)>g_{j}(x), \quad \forall j \neq i
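
A minimal sketch of a zero-one-loss classifier built from the log discriminant g_{i}(x) = \ln P(x | \omega_{i}) + \ln P(\omega_{i}), assuming hypothetical 1-D Gaussian class-conditional densities and priors:

```python
# Zero-one loss: the class with the largest discriminant (= largest posterior) wins.
import numpy as np
from scipy.stats import norm

priors = np.array([0.3, 0.3, 0.4])                        # P(omega_i), assumed
cond = [norm(-1.0, 1.0), norm(0.0, 0.5), norm(2.0, 1.5)]  # p(x | omega_i), assumed

def classify(x):
    # g_i(x) = ln p(x | omega_i) + ln P(omega_i); max discriminant = min risk
    g = np.array([d.logpdf(x) + np.log(p) for d, p in zip(cond, priors)])
    return int(np.argmax(g)) + 1      # 1-based class index

print(classify(-0.8), classify(0.1), classify(3.0))       # -> 1 2 3
```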

Binary classification \longrightarrow Multi-class classification

  • One vs. One
    For N classes, train \frac{N(N-1)}{2} pairwise classifiers; the final label is chosen by voting over their predictions.
  • One vs. Rest
    Train N classifiers, one per class against all the rest; predict the class whose classifier outputs positive (if several do, take the most confident one).
  • ECOC (Error‐Correcting Output Codes)
    Each class is assigned a codeword (a row of the table below). At prediction time, the code formed by the binary classifiers' outputs is compared with each class's codeword, and the class with the smallest code distance is the result (see the sketch after the table).
        f1   f2   f3
    c1  -1    1   -1
    c2   1   -1   -1
    c3  -1    1    1
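
A small sketch of ECOC decoding with the codeword table above; the binary classifiers' outputs for a test point are made up:

```python
# ECOC decoding: compare the predicted code against each class codeword (one row
# per class) and pick the class with the smallest Hamming distance.
import numpy as np

codewords = np.array([[-1,  1, -1],   # c1
                      [ 1, -1, -1],   # c2
                      [-1,  1,  1]])  # c3
predicted = np.array([-1, 1, 1])      # outputs of f1, f2, f3 (assumed)

hamming = np.sum(codewords != predicted, axis=1)   # distance to every codeword
print(hamming, np.argmin(hamming) + 1)             # [1 3 0] -> class c3
```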