Propensity Score Matching (PSM)
The steps are:
- Preliminary analysis;
- Estimation of Propensity Score;
- Propensity Score Matching; *****
- Outcome analysis;
- Sensitivity analysis
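To make the "Estimation of Propensity Score" step concrete, here is a minimal sketch using base R only. It assumes the propensity score is estimated with a logistic regression of treatment on the covariates, which is a common choice but not the only one. The data are simulated, and the names `toy`, `treat`, `age`, `sex` and `pscore` are hypothetical, not taken from the tumor dataset used below.

```r
# Simulated data; every variable name here is hypothetical.
set.seed(42)
n <- 200
age <- round(rnorm(n, mean = 60, sd = 10))
sex <- rbinom(n, 1, 0.5)
# Treatment assignment depends on age and sex, so the groups differ at baseline.
treat <- rbinom(n, 1, plogis(-3 + 0.04 * age + 0.5 * sex))
toy <- data.frame(treat, age, sex)

# Logistic regression of treatment status on the covariates.
ps_model <- glm(treat ~ age + sex, data = toy, family = binomial())

# The fitted probabilities are the estimated propensity scores.
toy$pscore <- fitted(ps_model)
summary(toy$pscore)
```

Each score is the estimated probability of receiving treatment given the covariates; matching then pairs treated and control cases with similar scores.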
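Preparation:
- Data cleaning: number the cases that meet the inclusion and exclusion criteria, give the variables English names, and code their values. Be sure to remove cases with missing values, otherwise R will stop with an error. Save the file in .csv format.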
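The missing-value step can be sketched in base R as follows. The data frame `raw` and its columns are made up for illustration, and the output goes to a temporary file rather than a real path.

```r
# Hypothetical raw data with a few missing values.
raw <- data.frame(
  id    = 1:6,
  radio = c(1, 0, 1, 0, NA, 1),
  age   = c(55, 62, NA, 48, 70, 66),
  sex   = c(1, 0, 0, 1, 1, 0)
)

# Count the missing values in each column.
colSums(is.na(raw))

# Keep only the rows that have no missing value in any column.
clean <- raw[complete.cases(raw), ]
nrow(clean)  # 4 rows remain (rows 3 and 5 are dropped)

# Save the cleaned data as CSV (a temporary file is used here).
out_file <- tempfile(fileext = ".csv")
write.csv(clean, out_file, row.names = FALSE)
```

Dropping incomplete cases is the simplest option; whether it is appropriate depends on how much data is missing and why.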
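# install.packages("MatchIt")
library(MatchIt)
mydata <- read.csv("C:/tumor/R-data.csv")
# Preview the first 20 rows
head(mydata, 20)
# Nearest-neighbor matching at a 1:1 ratio. Depending on the sample size, 1:2, 1:3, etc. can also be used (generally no more than 1:5); just change the number after ratio=.
# The data frame is passed explicitly with data=, so attach() is not needed.
m.out <- matchit(radio ~ sex + age + margin + lymph + differentiated + disease + Tu + M + site + neck + targeted,
                 data = mydata, method = "nearest", ratio = 1)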
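# Assess the matching quality by comparing the groups before and after matching, numerically (summary) and visually (jitter plot and histogram)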
summary (m.out)
plot (m.out, type = "jitter")
plot (m.out, type = "hist")
m.data1 <- match.data (m.out)
write.csv(m.data1, file = "C:/tumor/match_nearest.csv")
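To show roughly what nearest-neighbor matching and the balance summary do, here is a simplified base R sketch on simulated data. It is not MatchIt's implementation: it matches greedily in data order, without replacement and without a caliper, and the helper `smd()` and all variable names are hypothetical.

```r
# Simulated data: older patients are more likely to be treated.
set.seed(1)
n <- 300
age <- rnorm(n, mean = 60, sd = 10)
treat <- rbinom(n, 1, plogis(-4 + 0.05 * age))
d <- data.frame(treat, age)
d$ps <- fitted(glm(treat ~ age, data = d, family = binomial()))

# Greedy 1:1 nearest-neighbor matching on the propensity score,
# without replacement: each control can be used at most once.
treated  <- which(d$treat == 1)
controls <- which(d$treat == 0)
matched_controls <- integer(0)
for (i in treated) {
  if (length(controls) == 0) break
  j <- controls[which.min(abs(d$ps[controls] - d$ps[i]))]
  matched_controls <- c(matched_controls, j)
  controls <- setdiff(controls, j)
}
matched <- d[c(treated[seq_along(matched_controls)], matched_controls), ]

# Standardized mean difference, scaled by the SD in the treated group.
smd <- function(x, g) {
  (mean(x[g == 1]) - mean(x[g == 0])) / sd(x[g == 1])
}
smd_before <- smd(d$age, d$treat)
smd_after  <- smd(matched$age, matched$treat)
c(before = smd_before, after = smd_after)
```

A standardized mean difference close to 0 after matching suggests the covariate is well balanced; an absolute value below 0.1 is a commonly used rule of thumb.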
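Besides nearest, the following matching methods are available:
- Exact Matching: treated and control cases are matched exactly on every covariate, i.e. their values are identical. Not suitable when there are many covariates or when covariates take a wide range of values. (method = "exact")
- Subclassification: splits the data into subclasses such that the covariate distributions of the two groups are similar within each subclass. (method = "subclass")
- Optimal Matching: minimizes the average absolute distance across all matched pairs; requires the optmatch package. (method = "optimal")
- Genetic Matching: uses a genetic search algorithm to find the matches; requires the Matching package. (method = "genetic")
- Coarsened Exact Matching: coarsens each covariate into bins and matches exactly on the binned values, so the balance on one covariate can be tuned without affecting the balance on the others. (method = "cem")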
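Switching methods only requires changing the `method` argument. The sketch below uses the `lalonde` example data that ships with MatchIt instead of the tumor data, and assumes MatchIt 4.0 or later, where `race` is a single factor column. It shows the two methods from the list that need no additional package.

```r
library(MatchIt)
data("lalonde", package = "MatchIt")

# Exact matching, restricted to a few categorical covariates so that
# enough cases share identical values.
m.exact <- matchit(treat ~ race + married + nodegree,
                   data = lalonde, method = "exact")

# Subclassification into 5 strata of the propensity score.
m.sub <- matchit(treat ~ age + educ + race + married + nodegree + re74 + re75,
                 data = lalonde, method = "subclass", subclass = 5)

# Balance before and after, as with nearest-neighbor matching.
summary(m.exact)
summary(m.sub)
```

The other methods are called the same way once their supporting packages are installed.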