A hypothesis test is a procedure that allows us to "confidently" reject a hypothesis if it is clearly statistically inconsistent with data.
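1. Basic concepts
1.1 The four elements and basic procedure of a hypothesis test
The four elements:
- Null hypothesis, $H_0$
- Alternative hypothesis, $H_1$ (or $H_a$)
- Test statistic
- Rejection region
Basic procedure:
- State the null and alternative hypotheses according to the goal of the experiment.
- Collect the data and compute the test statistic.
- If the statistic falls in the rejection region, reject the null hypothesis; otherwise, fail to reject it.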
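Note 1: on $H_0$ and $H_1$
- $H_0$ and $H_1$ are not complementary or symmetric. $H_1$ may contain any statement about the population distribution under which $H_0$ does not hold.
- In practice, the hypothesis we hope to reject is usually taken as $H_0$, and the hypothesis we hope to support is taken as $H_1$. This is because hypothesis tests are designed to avoid rejecting $H_0$ when it is true. Therefore, when the test rejects $H_0$, one can be quite sure that $H_0$ is false. This relates to the two types of error in hypothesis testing, discussed in 1.2.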
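Note 2: on the test statistic
- Different statistics are used depending on the target of the test (mean, variance, ...).
- Their theoretical basis comes from the Central Limit Theorem, properties of the normal distribution, the Likelihood Ratio Test, Pearson's $\chi^2$ test, and so on; see section 2 for details.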
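Note 3: on the rejection region
- The rejection region is a set of values of the test statistic (typically an interval or a union of intervals), computed under the assumption that $H_0$ holds, from the significance level $\alpha$ chosen in advance and the form of $H_1$.
- It represents an event that has small probability when $H_0$ is true.
- If this small-probability event occurs, the data are hard to reconcile with $H_0$, so we reject the null hypothesis.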
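1.2 The two types of error in hypothesis testing
Type I error: rejecting $H_0$ when it is true
- Avoiding this type of error is the first priority.
- The probability of making this error is denoted by $\alpha$.
- $\alpha$ is also called the significance level.
Type II error: not rejecting $H_0$ when it is false
- The probability of making this error is denoted by $\beta$.
- $1 - \beta$ is called the power of the test.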
Relationship between type I and type II errors:
We can always reduce the type I error by making the rejection region smaller. This typically comes at the expense of a larger type II error.
In practice, we want a test that is as powerful as possible for a given type I error.
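The trade-off can be made concrete with a small sketch. It assumes a one-sided z test of $H_0: \mu = \mu_0$ against $H_1: \mu > \mu_0$ with known $\sigma$, where the type II error at a true mean $\mu_1$ is $\beta = \Phi\left(z_\alpha - (\mu_1 - \mu_0)\sqrt{n}/\sigma\right)$. The function name `type_ii_error` and the numbers in the loop are illustrative choices, not part of the original notes.

```python
from statistics import NormalDist


def type_ii_error(alpha, mu0, mu1, sigma, n):
    """Type II error of the one-sided z test (H1: mu > mu0) when the true mean is mu1."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)   # critical value: reject when Z > z_alpha
    shift = (mu1 - mu0) * n ** 0.5 / sigma    # mean of Z when the true mean is mu1
    return std_normal.cdf(z_alpha - shift)    # P(Z <= z_alpha) under mu1


# Hypothetical setting: mu0 = 0, true mean 0.5, sigma = 1, n = 25.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, mu0=0.0, mu1=0.5, sigma=1.0, n=25)
    print(f"alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

Shrinking $\alpha$ pushes the critical value outward, so $\beta$ grows and the power falls.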
1.3 P-values
The P-value is the smallest significance level $\alpha$ at which the observed data (once the random experiment has been carried out) lead to rejection of $H_0$.
The smaller the P-value, the stronger the evidence against the null hypothesis.
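As a minimal sketch of this definition, the snippet below computes the P-value for a test statistic that is standard normal under $H_0$ (the case treated in 2.1.1) and shows that the test rejects exactly at the levels $\alpha$ that are at least as large as the P-value. The function name `z_p_value` and the observed value 2.1 are hypothetical.

```python
from statistics import NormalDist


def z_p_value(z, alternative="greater"):
    """P-value of an observed statistic z that is N(0, 1) under H0."""
    cdf = NormalDist().cdf
    if alternative == "greater":
        return 1 - cdf(z)
    if alternative == "less":
        return cdf(z)
    if alternative == "two-sided":
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


p = z_p_value(2.1, "greater")   # hypothetical observed statistic
for alpha in (0.10, 0.05, 0.01):
    decision = "reject H0" if p <= alpha else "fail to reject H0"
    print(f"alpha={alpha:.2f}: {decision}")
```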
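2. Common hypothesis tests and their rationale
2.1 Central Limit Theorem
Let $X_1, \dots, X_n$ be independently and identically distributed, with $E[X_i] = \mu$ and $\mathrm{Var}(X_i) = \sigma^2$ known. Then, as $n \to \infty$,
$$\frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} N(0, 1), \qquad \text{where } \bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i .$$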
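A quick simulation can illustrate the theorem. The sketch below assumes Uniform(0, 1) draws (so $\mu = 1/2$, $\sigma^2 = 1/12$), standardizes the mean of $n = 30$ draws many times, and checks how often the result exceeds the standard normal 95% quantile. The distribution, sample size, seed, and repetition count are arbitrary illustrative choices.

```python
import random
from statistics import NormalDist


def standardized_mean(n, rng):
    """Draw n Uniform(0, 1) values and standardize their sample mean."""
    mu, sigma = 0.5, (1 / 12) ** 0.5
    xbar = sum(rng.random() for _ in range(n)) / n
    return (xbar - mu) / (sigma / n ** 0.5)


rng = random.Random(0)                                   # fixed seed for reproducibility
z_values = [standardized_mean(30, rng) for _ in range(20000)]
z_05 = NormalDist().inv_cdf(0.95)                        # upper 5% point of N(0, 1)
tail_fraction = sum(z > z_05 for z in z_values) / len(z_values)
print(f"fraction above z_0.05: {tail_fraction:.4f} (normal theory: 0.05)")
```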
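2.1.1 Large-sample test for the mean
Hypotheses: To test the hypothesis $H_0: \mu = \mu_0$ against one of these alternative hypotheses:
$H_1: \mu > \mu_0$; or
$H_1: \mu < \mu_0$; or
$H_1: \mu \neq \mu_0$
Test statistic:
$$Z = \frac{\bar{X}_n - \mu_0}{\sigma / \sqrt{n}}$$
Rejection region (RR):
Define $z_\alpha$ as $1 - \Phi(z_\alpha) = \alpha$, where $\Phi$ is the CDF of $N(0, 1)$. Then
(1) for $H_1: \mu > \mu_0$, the RR is $\{Z > z_\alpha\}$
(2) for $H_1: \mu < \mu_0$, the RR is $\{Z < -z_\alpha\}$
(3) for $H_1: \mu \neq \mu_0$, the RR is $\{|Z| > z_{\alpha/2}\}$
Note: If the population variance $\sigma^2$ is unknown, you can replace it by the sample variance $S^2$, since $n$ is large.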
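The large-sample procedure can be sketched in a few lines using only the standard library. The statistic is $Z = (\bar{X}_n - \mu_0)/(\sigma/\sqrt{n})$ and the decision compares it with the standard normal quantile for the chosen alternative. The function name `z_test`, its parameters, and the sample values are hypothetical.

```python
from statistics import NormalDist, mean


def z_test(sample, mu0, sigma, alpha=0.05, alternative="greater"):
    """Return (z, reject) for the large-sample test of H0: mu = mu0."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    quantile = NormalDist().inv_cdf
    if alternative == "greater":
        reject = z > quantile(1 - alpha)
    elif alternative == "less":
        reject = z < -quantile(1 - alpha)
    elif alternative == "two-sided":
        reject = abs(z) > quantile(1 - alpha / 2)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, reject


# Made-up measurements (n = 36), tested against mu0 = 5.0 with sigma = 0.3.
sample = [4.9, 5.3, 5.1, 5.6, 4.8, 5.4] * 6
z, reject = z_test(sample, mu0=5.0, sigma=0.3, alternative="greater")
print(f"z = {z:.3f}, reject H0: {reject}")
```

If $\sigma$ is unknown, passing the sample standard deviation (`statistics.stdev(sample)`) as `sigma` corresponds to the note above.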
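2.1.2 Small-sample test for the mean
For a small sample from an (approximately) normal population with unknown variance, the normal distribution in the CLT above is replaced by Student's $t$ distribution with $n - 1$ degrees of freedom, i.e.
$$\frac{\bar{X}_n - \mu}{S / \sqrt{n}} \sim t_{n-1}, \qquad \text{where } S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2 .$$
Hypotheses: same as in 2.1.1
Test statistic:
$$T = \frac{\bar{X}_n - \mu_0}{S / \sqrt{n}}$$
Rejection region:
Define $t_\alpha$ as $P(W > t_\alpha) = \alpha$, where $W \sim t_{n-1}$. Then
(1) for $H_1: \mu > \mu_0$, the RR is $\{T > t_\alpha\}$
(2) for $H_1: \mu < \mu_0$, the RR is $\{T < -t_\alpha\}$
(3) for $H_1: \mu \neq \mu_0$, the RR is $\{|T| > t_{\alpha/2}\}$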
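A sketch of the small-sample test, under the assumption that the data come from a normal population. The Python standard library has no $t$ distribution, so the tail probability $P(W > t)$ for $W \sim t_{n-1}$ is obtained here by integrating the $t$ density with Simpson's rule; this is a swapped-in numerical shortcut suitable for small degrees of freedom, not a production routine. Rejecting when the P-value is at most $\alpha$ is equivalent to the rejection regions above. The names `t_upper_tail` and `t_test` are hypothetical.

```python
import math
from statistics import mean, stdev


def t_upper_tail(t, df, steps=2000):
    """P(W > t) for W ~ t_df, by Simpson's rule on the density over [0, |t|]."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps                              # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    tail = 0.5 - total * h / 3                 # P(W > |t|), using symmetry about 0
    return tail if t >= 0 else 1 - tail


def t_test(sample, mu0, alternative="two-sided"):
    """Return (t, p_value) for H0: mu = mu0 with unknown variance."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))   # stdev uses n - 1
    upper = t_upper_tail(t, n - 1)
    p_value = {"greater": upper,
               "less": 1 - upper,
               "two-sided": 2 * min(upper, 1 - upper)}[alternative]
    return t, p_value


# Made-up sample of size 8, tested against mu0 = 10.
t, p = t_test([10.4, 9.9, 10.8, 10.1, 10.6, 10.3, 9.7, 10.9], mu0=10.0)
print(f"t = {t:.3f}, two-sided p-value = {p:.4f}")
```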
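2.2 Properties of the normal distribution
If $X_1, \dots, X_n$ are i.i.d. $N(\mu, \sigma^2)$, then
$$\bar{X}_n \sim N\left(\mu, \frac{\sigma^2}{n}\right), \qquad \frac{(n-1) S^2}{\sigma^2} \sim \chi^2_{n-1},$$
where $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2$ is the sample variance.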
2.2.1 Test for the mean of a normal distribution
The procedure is the same as in 2.1.1 (large-sample test for the mean).
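2.2.2 Test for the variance of a normal distribution
Hypotheses: To test the hypothesis $H_0: \sigma^2 = \sigma_0^2$
against one of these alternatives: $H_1: \sigma^2 > \sigma_0^2$; or $H_1: \sigma^2 < \sigma_0^2$; or $H_1: \sigma^2 \neq \sigma_0^2$
Test statistic:
$$\chi^2 = \frac{(n-1) S^2}{\sigma_0^2}$$
Rejection region:
Define $\chi^2_\alpha$ and $\chi^2_{1-\alpha}$ as
$$P(W > \chi^2_\alpha) = \alpha, \qquad P(W > \chi^2_{1-\alpha}) = 1 - \alpha,$$
where $W \sim \chi^2_{n-1}$. Then
(1) for $H_1: \sigma^2 > \sigma_0^2$, the RR is $\{\chi^2 > \chi^2_\alpha\}$
(2) for $H_1: \sigma^2 < \sigma_0^2$, the RR is $\{\chi^2 < \chi^2_{1-\alpha}\}$
(3) for $H_1: \sigma^2 \neq \sigma_0^2$, the RR is $\{\chi^2 > \chi^2_{\alpha/2}\}$ or $\{\chi^2 < \chi^2_{1-\alpha/2}\}$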
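The variance test can be sketched as follows, assuming normal data. The statistic is $(n-1)S^2/\sigma_0^2$, referred to a $\chi^2_{n-1}$ distribution. Because the standard library has no $\chi^2$ distribution, the upper-tail probability is computed from the power series for the regularized lower incomplete gamma function; that helper is an illustrative substitute that is adequate for moderate arguments, not a high-precision routine. The names `chi2_upper_tail` and `variance_test` are hypothetical, and the two-sided P-value uses the equal-tailed convention matching case (3).

```python
import math
from statistics import variance


def chi2_upper_tail(x, df, max_terms=500):
    """P(W > x) for W ~ chi-square(df), via the series for the lower incomplete gamma."""
    if x <= 0:
        return 1.0
    a, y = df / 2, x / 2
    term = total = 1.0
    for k in range(1, max_terms):
        term *= y / (a + k)                    # y^k / ((a+1)(a+2)...(a+k))
        total += term
        if term < 1e-15 * total:
            break
    lower = total * math.exp(a * math.log(y) - y - math.lgamma(a + 1))
    return 1 - lower


def variance_test(sample, sigma0_sq, alternative="greater"):
    """Return (statistic, p_value) for H0: sigma^2 = sigma0_sq."""
    n = len(sample)
    stat = (n - 1) * variance(sample) / sigma0_sq   # variance() divides by n - 1
    upper = chi2_upper_tail(stat, n - 1)
    p_value = {"greater": upper,
               "less": 1 - upper,
               "two-sided": 2 * min(upper, 1 - upper)}[alternative]
    return stat, p_value


# Made-up data: is the variance larger than 0.1?
stat, p = variance_test([10.4, 9.9, 10.8, 10.1, 10.6, 10.3, 9.7, 10.9], sigma0_sq=0.1)
print(f"chi2 = {stat:.3f}, p-value = {p:.4f}")
```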
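2.3 Likelihood Ratio Tests
Let $X_1, \dots, X_n$ be i.i.d. with density $f(x \mid \theta)$; then we have
(1) The likelihood of $\theta$ is
$$L(\theta) = \prod_{i=1}^{n} f(X_i \mid \theta).$$
(2) Suppose $H_0: \theta \in \Theta_0$, $H_1: \theta \in \Theta_1$,
where $\Theta_0, \Theta_1$ are some sets of possible parameter values and $\Theta_0 \cap \Theta_1 = \emptyset$. Let $\Theta = \Theta_0 \cup \Theta_1$.
Define the generalized likelihood ratio as
$$\Lambda = \frac{\max_{\theta \in \Theta_0} L(\theta)}{\max_{\theta \in \Theta} L(\theta)}.$$
Under $H_0$ and suitable regularity conditions, for large $n$,
$$-2 \log \Lambda \approx \chi^2_{d - d_0},$$
where $d$ is the dimension of parameter space $\Theta$ and $d_0$ is the dimension of parameter space $\Theta_0$.
Note: Computing $\Lambda$ involves the Maximum Likelihood Estimator.
Hypotheses: $H_0: \theta \in \Theta_0$, $H_1: \theta \in \Theta_1$
Test statistic: $-2 \log \Lambda$
Rejection region: $\{-2 \log \Lambda > \chi^2_\alpha(d - d_0)\}$, where $\chi^2_\alpha(\nu)$ is the upper-$\alpha$ quantile of the $\chi^2$ distribution with $\nu$ degrees of freedom.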
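As one concrete instance, consider Bernoulli data with $H_0: p = p_0$ against $H_1: p \neq p_0$. This example is chosen here for illustration and is not taken from the original notes. The unrestricted MLE is $\hat{p} = k/n$ for $k$ successes in $n$ trials, the null space has dimension 0 and the full space dimension 1, so $-2 \log \Lambda$ is compared with a $\chi^2_1$ distribution. Since a $\chi^2_1$ variable is the square of a standard normal, its tail probability is $2(1 - \Phi(\sqrt{x}))$. The function name `bernoulli_lrt` is hypothetical.

```python
import math
from statistics import NormalDist


def bernoulli_lrt(successes, n, p0):
    """Return (-2 log Lambda, p_value) for H0: p = p0 vs H1: p != p0."""
    p_hat = successes / n                      # unrestricted MLE

    def count_log(count, ratio):
        # Convention 0 * log(0) = 0 for empty categories.
        return count * math.log(ratio) if count > 0 else 0.0

    stat = 2 * (count_log(successes, p_hat / p0)
                + count_log(n - successes, (1 - p_hat) / (1 - p0)))
    stat = max(stat, 0.0)                      # guard against tiny negative rounding
    p_value = 2 * (1 - NormalDist().cdf(math.sqrt(stat)))   # chi-square(1) upper tail
    return stat, p_value


# Hypothetical data: 60 successes in 100 trials, H0: p = 0.5.
stat, p = bernoulli_lrt(60, 100, 0.5)
print(f"-2 log Lambda = {stat:.3f}, p-value = {p:.4f}")
```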
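2.4 Pearson's $\chi^2$ test
2.4.1 $\chi^2$ test of multinomial data
Suppose each individual's category is a multinomial draw with probability $p = (p_1, \dots, p_k)$.
Let $N = (N_1, \dots, N_k)$ be the number of observed individuals in each category, with $n = \sum_{i=1}^{k} N_i$. Then
$$N \sim \mathrm{Multinomial}(n, p).$$
Let $\Delta$ be the simplex, i.e.
$$\Delta = \left\{ p : p_i \ge 0, \ \sum_{i=1}^{k} p_i = 1 \right\}.$$
The maximum likelihood estimator (MLE) over all $p \in \Delta$ is:
$$\hat{p}_i = \frac{N_i}{n}.$$
We test $H_0: p \in \Delta_0$ vs $H_1: p \in \Delta \setminus \Delta_0$, where $\Delta_0 \subset \Delta$.
Under $H_0$ and using the MLE $\hat{p}^{(0)}$ restricted to $\Delta_0$, we can get the expected number for each category as $E_i = n \hat{p}^{(0)}_i$. Then, for large $n$ and under $H_0$,
$$X^2 = \sum_{i=1}^{k} \frac{(N_i - E_i)^2}{E_i} \approx \chi^2_{(k-1) - \dim \Delta_0}.$$
When $H_0$ fixes the probabilities completely, $H_0: p = p_0$, this gives $E_i = n p_{0,i}$ and $k - 1$ degrees of freedom.
Hypotheses: $H_0: p \in \Delta_0$, $H_1: p \in \Delta \setminus \Delta_0$
Test statistic: $X^2 = \sum_{i=1}^{k} (N_i - E_i)^2 / E_i$
Rejection region: $\{X^2 > \chi^2_\alpha(\nu)\}$ with $\nu = (k-1) - \dim \Delta_0$, where $\chi^2_\alpha(\nu)$ is the upper-$\alpha$ quantile of the $\chi^2$ distribution with $\nu$ degrees of freedom.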
Note: A likelihood ratio test could also be applied here. The two tests are asymptotically equivalent under $H_0$, and Pearson's $\chi^2$ test is the one more commonly used in practice.
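A minimal sketch of Pearson's test for the simplest case, a fully specified null hypothesis $p = p_0$, where the expected counts are $n p_{0,i}$ and the reference distribution is $\chi^2$ with $k - 1$ degrees of freedom. The function name `pearson_chi2` and the die-roll counts are made up for illustration; the critical value 11.070 is the tabulated upper 5% point of $\chi^2_5$.

```python
def pearson_chi2(observed, p0):
    """Return (X^2, df) for H0: category probabilities equal p0."""
    n = sum(observed)
    expected = [n * p for p in p0]             # E_i = n * p0_i
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


# Hypothetical counts from 120 rolls of a die; H0: the die is fair.
counts = [18, 22, 16, 25, 19, 20]
stat, df = pearson_chi2(counts, [1 / 6] * 6)

CRITICAL_5_PERCENT_DF5 = 11.070                # tabulated chi-square value, df = 5
print(f"X^2 = {stat:.3f}, df = {df}, reject H0: {stat > CRITICAL_5_PERCENT_DF5}")
```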
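2.4.2 $\chi^2$ test of independence
This tests whether two categorical variables are independent of each other.
Suppose we have observed an $r \times c$ contingency table, where $N_{ij}$ is the number of individuals in row category $i$ and column category $j$.
| | Col 1 | Col 2 | ... | Col c |
|---|---|---|---|---|
| Row 1 | $N_{11}$ | $N_{12}$ | ... | $N_{1c}$ |
| Row 2 | $N_{21}$ | $N_{22}$ | ... | $N_{2c}$ |
| ... | ... | ... | ... | ... |
| Row r | $N_{r1}$ | $N_{r2}$ | ... | $N_{rc}$ |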
Hypotheses:
$H_0$: row and column variables are independent.
$H_1$: row and column variables are dependent.
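Under $H_0$ the cell probabilities factor as $p_{ij} = p_{i\cdot} \, p_{\cdot j}$, where $p_{i\cdot}$ and $p_{\cdot j}$ are the row and column marginal probabilities, so we have the following table of cell probabilities:
| | Col 1 | Col 2 | ... | Col c |
|---|---|---|---|---|
| Row 1 | $p_{1\cdot} p_{\cdot 1}$ | $p_{1\cdot} p_{\cdot 2}$ | ... | $p_{1\cdot} p_{\cdot c}$ |
| Row 2 | $p_{2\cdot} p_{\cdot 1}$ | $p_{2\cdot} p_{\cdot 2}$ | ... | $p_{2\cdot} p_{\cdot c}$ |
| ... | ... | ... | ... | ... |
| Row r | $p_{r\cdot} p_{\cdot 1}$ | $p_{r\cdot} p_{\cdot 2}$ | ... | $p_{r\cdot} p_{\cdot c}$ |
The MLEs for $p_{i\cdot}$ and $p_{\cdot j}$ are
$$\hat{p}_{i\cdot} = \frac{N_{i\cdot}}{n}, \qquad \hat{p}_{\cdot j} = \frac{N_{\cdot j}}{n},$$
where $N_{i\cdot} = \sum_j N_{ij}$ and $N_{\cdot j} = \sum_i N_{ij}$ are the row and column totals of the observed counts $N_{ij}$, and $n$ is the grand total.
Then we can get the expected number of individuals for each cell:
$$E_{ij} = n \, \hat{p}_{i\cdot} \, \hat{p}_{\cdot j} = \frac{N_{i\cdot} N_{\cdot j}}{n}.$$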
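Test statistic:
$$X^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(N_{ij} - E_{ij})^2}{E_{ij}},$$
where $N_{ij}$ is the observed count and $E_{ij}$ the expected count under $H_0$ for cell $(i, j)$.
Rejection region: $\{X^2 > \chi^2_\alpha(\nu)\}$, the upper-$\alpha$ quantile of the $\chi^2$ distribution with $\nu = (r-1)(c-1)$ degrees of freedom.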
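The independence test can be sketched directly from the row and column totals: the expected count of a cell under $H_0$ is (row total × column total) / grand total, and the statistic has $(r-1)(c-1)$ degrees of freedom. The function name `independence_chi2` and the 2×2 table are hypothetical; 3.841 is the tabulated upper 5% point of $\chi^2_1$.

```python
def independence_chi2(table):
    """Return (X^2, df) for an r x c table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # count expected under H0
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Made-up 2 x 2 table of counts.
table = [[30, 20],
         [20, 30]]
stat, df = independence_chi2(table)

CRITICAL_5_PERCENT_DF1 = 3.841                 # tabulated chi-square value, df = 1
print(f"X^2 = {stat:.3f}, df = {df}, reject H0: {stat > CRITICAL_5_PERCENT_DF1}")
```

The sketch assumes every row and column total is positive; a zero total would make an expected count zero and the statistic undefined.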