[RNA-seq Analysis Workflow] 06. Building a Nomogram

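1. Introduction

A nomogram, also known as an alignment diagram, is built from the results of a multivariable regression analysis.
It integrates multiple predictors into a single figure and visualizes how each variable in the prediction model contributes to the predicted outcome.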
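
To make the idea concrete, here is a minimal base-R sketch of how regression coefficients can be turned into nomogram points. It is not the rms implementation: the simulated data, the gene names `geneA` and `geneB`, and the helper `points_for()` are assumptions made for illustration, and rms chooses its axis limits from `datadist`, so its numbers may differ slightly.

```r
# Simulated data; gene names and effect sizes are illustrative only
set.seed(1)
n <- 200
toy <- data.frame(geneA = rnorm(n, 8, 1), geneB = rnorm(n, 10, 2))
toy$label <- rbinom(n, 1, plogis(-17 + 1.0 * toy$geneA + 0.9 * toy$geneB))

# Ordinary logistic regression (base R stand-in for rms::lrm)
fit <- glm(label ~ geneA + geneB, data = toy, family = binomial)
beta <- coef(fit)[c("geneA", "geneB")]

# Observed range of each predictor (row 1 = min, row 2 = max)
ranges <- sapply(toy[c("geneA", "geneB")], range)

# How far each predictor can move the log-odds across its range
span <- abs(beta) * (ranges[2, ] - ranges[1, ])

# Lowest log-odds contribution of each predictor (the 0-point end of its axis)
lowest <- sapply(names(beta), function(v) min(beta[[v]] * ranges[, v]))

# Points for value x of predictor var: contribution above the 0-point end,
# scaled so that the predictor with the widest span is worth 100 points
points_for <- function(var, x) {
  100 * (beta[[var]] * x - lowest[[var]]) / max(span)
}

# Total points per sample, mapped back to a predicted probability
total_points <- points_for("geneA", toy$geneA) + points_for("geneB", toy$geneB)
lp <- coef(fit)[["(Intercept)"]] + sum(lowest) + total_points * max(span) / 100
risk <- plogis(lp)
```

Reading a nomogram by hand follows the same path: look up the points for each gene, add them, and read the risk off the total-points axis. In this sketch `risk` reproduces the fitted probabilities of the model.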


2. Data

Based on the selected key genes, extract the corresponding rows from the training-set gene expression table and label each sample by group (AAV or Control).
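
The preparation step can be sketched on a toy table using base R only. The matrix `expr`, the sample IDs, and all values below are made up for illustration; the real objects in this workflow come from earlier steps.

```r
# Toy expression matrix: genes in rows, samples in columns (values are made up)
expr <- matrix(c(5.1, 6.3, 7.2, 4.8,
                 9.0, 8.4, 7.7, 9.5,
                 3.3, 3.1, 2.9, 3.6),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("VCAN", "CD74", "ACTB"), paste0("S", 1:4)))
key_genes <- c("VCAN", "CD74")

# Hypothetical grouping table with one row per sample
groups <- data.frame(Sample = paste0("S", 1:4),
                     label = c("AAV", "Control", "AAV", "Control"))

# Keep the key genes and transpose: rows become samples, columns become genes
dat <- as.data.frame(t(expr[rownames(expr) %in% key_genes, , drop = FALSE]))

# Attach the group label by sample ID, then restore sample IDs as row names
dat$Sample <- rownames(dat)
dat <- merge(dat, groups, by = "Sample", all.x = TRUE)
rownames(dat) <- dat$Sample
dat$Sample <- NULL

# Encode the outcome for logistic regression: 1 = AAV, 0 = Control
dat$label <- ifelse(dat$label == "AAV", 1, 0)
```

The result has one row per sample, one column per key gene, and a binary `label` column, which is the shape a logistic regression expects.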


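3. Approach

  1. Use the R package "rms" to build a nomogram prediction model from the key genes and compute a gene score for each sample.
  2. Use the R package "rms" to plot the calibration curve.
  3. Run a decision curve analysis with the R package "rmda" and visualize it.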
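
The quantity on the y-axis of a decision curve is the net benefit at a threshold probability pt, commonly defined as TP/n - FP/n * pt / (1 - pt). Below is a minimal hand-rolled sketch of that formula; the helper `net_benefit()` and the toy vectors are assumptions for illustration, and classifying with `>=` rather than `>` is a convention chosen here, not necessarily what rmda does internally.

```r
# Net benefit of treating every sample whose predicted risk reaches pt
net_benefit <- function(y, risk, pt) {
  n <- length(y)
  treat <- risk >= pt
  tp <- sum(treat & y == 1)   # true positives among the treated
  fp <- sum(treat & y == 0)   # false positives among the treated
  tp / n - fp / n * pt / (1 - pt)
}

# Toy outcome (1 = case, 0 = control) and toy predicted risks
y    <- c(1, 1, 1, 0, 0, 0, 1, 0)
risk <- c(0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2)

thresholds <- seq(0.1, 0.9, by = 0.1)

# Curve for the model, and the "treat all" reference strategy
nb_model <- sapply(thresholds, function(pt) net_benefit(y, risk, pt))
nb_all   <- sapply(thresholds, function(pt) net_benefit(y, rep(1, length(y)), pt))

print(data.frame(threshold = thresholds, model = nb_model, treat_all = nb_all))
```

A model is useful at a given threshold when its net benefit is above both the "treat all" curve and zero (the "treat none" strategy).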

4. Code

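library(rms)         # lrm(), nomogram(), calibrate()
library(rmda)        # decision_curve(), plot_decision_curve()
library(tibble)      # rownames_to_column(), column_to_rownames()
library(ggplot2)
library(data.table)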

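##----- 1. Data -------
# Keep only the key genes, then transpose so rows are samples and columns are genes
Nomo_data <- as.data.frame(t(hub_train_genes[rownames(hub_train_genes) %in% valid_select_sign_gene$significant_Gene$Gene, ]))
# Attach each sample's group label by sample ID
Nomo_data <- column_to_rownames(merge(rownames_to_column(Nomo_data, "Sample"), train_group, by = "Sample", all.x = TRUE), "Sample")
# Encode the outcome: 1 = AAV, 0 = Control
Nomo_data$label <- ifelse(Nomo_data$label == "AAV", 1, 0)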

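##----- 2. Nomogram ------
# rms uses the stored predictor distributions to set up the nomogram axes
ddist <- datadist(Nomo_data)
options(datadist = "ddist")
# Logistic regression on the two key genes; x = TRUE, y = TRUE keep the data that calibrate() needs later
model <- lrm(label ~ VCAN + CD74, data = Nomo_data, x = TRUE, y = TRUE)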

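# fun = plogis converts the linear predictor into a predicted probability.
# The result is stored as nomo so it does not share a name with the nomogram() function.
nomo <- nomogram(model, fun = plogis, funlabel = "Risk of AAV")

png("06.nomogram_plot.png")
plot(nomo)
dev.off()

pdf("06.nomogram_plot.pdf", width = 10, height = 7)
plot(nomo)
dev.off()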


# Gene score = linear predictor (log-odds) of the fitted model for each sample
Nomo_data$GeneScore <- predict(model, type = "lp")
head(Nomo_data$GeneScore)


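##----- 3. Calibration curve --------
set.seed(123)  # arbitrary seed so the bootstrap resampling is reproducible
# Overfitting-corrected calibration estimated from 1000 bootstrap resamples
calibration <- calibrate(model, method = "boot", B = 1000)

png("06.calibration.png")
plot(calibration)
dev.off()

pdf("06.calibration.pdf", width = 10, height = 7)
plot(calibration)
dev.off()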


##----- 4. Decision curve --------
# In decision_curve(), confidence.intervals is the confidence level (e.g. 0.95), not a logical flag
dca_result <- decision_curve(label ~ VCAN + CD74, data = Nomo_data, family = binomial(link = "logit"),
                             thresholds = seq(0, 1, by = 0.01), confidence.intervals = 0.95)

# Decision curve of the combined model alone
plot_decision_curve(dca_result, curve.names = "Nomogram Model",
                    xlab = "Threshold Probability", ylab = "Net Benefit",
                    cost.benefit.axis = FALSE, confidence.intervals = TRUE)

# Single-gene models for comparison
dca_vcan <- decision_curve(label ~ VCAN, data = Nomo_data, family = binomial(link = "logit"),
                           thresholds = seq(0, 1, by = 0.01), confidence.intervals = 0.95)

dca_cd74 <- decision_curve(label ~ CD74, data = Nomo_data, family = binomial(link = "logit"),
                           thresholds = seq(0, 1, by = 0.01), confidence.intervals = 0.95)

# Compare each single gene with the combined model. The call is wrapped in a
# function so the same figure can be drawn on screen and written to both files.
draw_decision_curves <- function() {
  plot_decision_curve(list(dca_vcan, dca_cd74, dca_result),
                      curve.names = c("VCAN", "CD74", "Nomogram Model"),
                      xlab = "Threshold Probability", ylab = "Net Benefit",
                      cost.benefit.axis = FALSE, confidence.intervals = TRUE)
}

draw_decision_curves()

png("06.decision_curve.png")
draw_decision_curves()
dev.off()

pdf("06.decision_curve.pdf", width = 10, height = 7)
draw_decision_curves()
dev.off()

5. Results

06.nomogram_plot.png

06.calibration.png

06.decision_curve.png