Chapter 2: Decision theory and Bayesian parameter inference

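This chapter introduces the following new concepts:
loss function L : Θ × D → [0, +∞). Not to be confused with the likelihood function: it measures the cost (the "inaccuracy") of taking decision d ∈ D when the true parameter is θ.
decision rule δ : X → D.
Putting the two together gives L(θ, δ(x)): the decision rule first turns the observed data x into a decision d = δ(x) (an estimate), and the loss function then gives the cost incurred by that estimate.
δ is an estimator of the unknown parameter θ.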
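
To make the composition L(θ, δ(x)) concrete, here is a minimal sketch. The quadratic loss, the sample-mean rule, and the numbers are illustrative assumptions, not taken from the chapter.

```python
# Illustrative choices (assumptions): quadratic loss and the sample mean as decision rule.

def quadratic_loss(theta, d):
    """L(theta, d): cost of deciding d when the true parameter is theta."""
    return (theta - d) ** 2

def sample_mean_rule(x):
    """delta(x): maps the observed data x to a decision (here, an estimate of theta)."""
    return sum(x) / len(x)

x = [1.0, 2.0, 3.0]   # observed data (made-up values)
theta = 2.5           # true parameter, known here only for illustration

d = sample_mean_rule(x)           # first apply the decision rule
cost = quadratic_loss(theta, d)   # then evaluate the loss of that decision
print(d, cost)
```

In practice θ is unknown, so the loss of a single decision cannot be computed; this is why the notes move on to the risk, which averages the loss over the data.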

Often we address the reverse question; that is, for which (if any) loss function L is the decision rule δ optimal?
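Note the direction: the usual question is, given a loss function, which decision rule makes the cost smallest. The reverse question starts from a decision rule δ and asks under which loss function (if any) that rule is optimal.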

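Frequentist risk: R(θ, δ) = \int_X L(θ, δ(x)) f(x|θ) dx
In other words, it is the loss L : Θ × D → [0, +∞) averaged over the data x drawn from f(· |θ), with θ held fixed. It is a function of θ and of the rule δ, not of any particular x.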
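
The risk integral above can be approximated by simulation: draw many x from f(· |θ), apply the rule, and average the loss. The sketch below assumes a one-dimensional N(θ, 1) model and quadratic loss; the function names and the two rules compared are hypothetical choices for illustration.

```python
import random

def quadratic_loss(theta, d):
    return (theta - d) ** 2

def monte_carlo_risk(theta, rule, n_draws=100_000, seed=0):
    """Approximate R(theta, rule) = E_theta[L(theta, rule(X))] with X ~ N(theta, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        x = rng.gauss(theta, 1.0)               # one draw from f(.|theta)
        total += quadratic_loss(theta, rule(x))
    return total / n_draws

theta = 2.0
risk_identity = monte_carlo_risk(theta, lambda x: x)        # exact value: 1
risk_shrunk = monte_carlo_risk(theta, lambda x: 0.5 * x)    # exact value: 0.25 + 0.25 * theta**2
print(risk_identity, risk_shrunk)
```

Under these assumptions the halved rule has exact risk 0.25 + 0.25 θ², so it beats δ(x) = x near θ = 0 and loses for large |θ|. Neither rule has smaller risk for every θ, which is the situation the notion of admissibility is designed to handle.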

Frequentist risk, Bayesian expected loss, and Bayes risk : https://www.bilibili.com/video/av73757778
How to understand Bayes Decision Theory in simple terms?: https://www.zhihu.com/question/27670909/answer/1656784818

Admissible

Definition 2.3 A decision rule δ is admissible if there exists no decision rule δ_0 such that R(θ, δ_0 ) ≤ R(θ, δ), ∀θ ∈ Θ with the above inequality being strict for at least one θ ∈ Θ.
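Admissibility is really just a baseline concept: a rule is admissible when no other rule is at least as good for every θ and strictly better for some θ.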
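
A small worked example of a rule that fails Definition 2.3, under assumptions that are not from the chapter: take X ~ N(θ, 1), quadratic loss, and the shifted rules δ_a(x) = x + a for a constant a.

```latex
\begin{aligned}
R(\theta, \delta_a)
  &= \mathbb{E}_\theta\big[(X + a - \theta)^2\big]
   = \operatorname{Var}_\theta(X) + a^2
   = 1 + a^2, \\
R(\theta, \delta_1) &= 2 \;>\; 1 = R(\theta, \delta_0)
  \quad \text{for all } \theta \in \mathbb{R}.
\end{aligned}
```

So δ_1 is not admissible, because δ_0(x) = x has strictly smaller risk at every θ. Note that this comparison says nothing about whether δ_0 itself is admissible; that requires ruling out every competing rule.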

Admissibility and Stein’s result

Not admissible

Theorem 2.1 Let Θ = R^d, f(· |θ) be the probability density function of the N_d(θ, I_d) distribution and L : Θ × Θ → [0, +∞) be the quadratic loss function. Then, when d ≥ 3, the maximum likelihood estimator (MLE) δ_0, defined by δ_0(x) = x, x ∈ R^d, is not admissible.
Proof: See Appendix 1.
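Building on the definition of admissibility, this theorem gives a notable case where admissibility fails.
It means that when the data are normally distributed with dimension at least 3 (d ≥ 3), the maximum likelihood estimator is not admissible under the quadratic loss function: some other estimator has risk that is never larger and is strictly smaller for at least one θ.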
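
Theorem 2.1 only asserts that a better rule exists. A standard witness, which these notes do not state, is the James-Stein estimator δ_JS(x) = (1 − (d − 2)/‖x‖²) x. The sketch below is a rough Monte Carlo check under the set-up of the theorem; the function names, the value of θ, and the number of draws are illustrative assumptions.

```python
import random

def james_stein(x):
    """James-Stein shrinkage toward the origin (assumed witness, not from the notes)."""
    d = len(x)
    norm_sq = sum(v * v for v in x)
    shrink = 1.0 - (d - 2) / norm_sq
    return [shrink * v for v in x]

def mc_risk(theta, rule, n_draws=20_000, seed=0):
    """Monte Carlo estimate of E_theta ||rule(X) - theta||^2 with X ~ N_d(theta, I_d)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        x = [rng.gauss(t, 1.0) for t in theta]
        est = rule(x)
        total += sum((e - t) ** 2 for e, t in zip(est, theta))
    return total / n_draws

theta = [1.0, -0.5, 2.0, 0.0, 0.5]        # d = 5, arbitrary illustrative value
risk_mle = mc_risk(theta, lambda x: x)    # exact risk of the MLE is d = 5
risk_js = mc_risk(theta, james_stein)
print(risk_mle, risk_js)
```

Both estimates use the same seed, so they are computed on the same simulated data and the comparison is less noisy. A simulation at one θ is only a sanity check, not a proof; the proof is in Appendix 1 of the source.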

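It is worth thinking about why this happens.
If two unrelated Gaussian data sets are combined and the total dimension reaches 3 or more, then by Theorem 2.1 the combined MLE is not admissible under the quadratic loss on the joint parameter, even though the data sets have nothing to do with each other.
Starting from this inadmissibility result, two further topics follow: Alternative loss functions and Non-Gaussian models.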

The main message of this theorem is that there are no general guarantees that the MLE is admissible.
However, it can be shown that, in the set-up of Theorem 2.1, δ_0 is minimax for any d ≥ 1 and is asymptotically efficient, and therefore inadmissible estimators are not necessarily bad estimators.
To sum up: Admissible estimators are not necessarily good estimators and inadmissible estimators are not necessarily bad estimators.
In other words, being admissible and being a good estimator are not tied together in either direction.
