Analyzing untargeted metabolomics data with the MetaboDiff package

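I recently got hold of an untargeted metabolomics dataset, so I worked through the MetaboDiff package to get familiar with the reasoning and workflow of a metabolomics analysis. The workflow below follows the official MetaboDiff documentation.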

1. Installing the MetaboDiff package
library("devtools")
install_github("andreasmock/MetaboDiff")
library(MetaboDiff)

2. Data processing
2.1 Importing the data

The MetaboDiff package requires three inputs:

  1. assay - a matrix of relative metabolite abundances;
  2. rowData - a data frame of metabolite annotations;
  3. colData - a data frame of sample metadata.

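The example data shipped with MetaboDiff come from the paper "AKT1 and MYC Induce Distinctive Metabolic Fingerprints in Human Prostate Cancer". The metabolomics data were measured in prostate tissue from 61 prostate cancer patients and 25 normal individuals.
Let's take a look at these three objects first.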

> assay[1:5,1:5]
         pat1      pat2      pat3     pat4      pat5
met1 33964.73 117318.43 118856.90  78670.7 102565.94
met2 18505.56 167585.32  59621.97  66220.4  74892.27
met3       NA  42373.93  27141.21       NA  38390.78
met4 61638.77  74595.78        NA       NA        NA
met5       NA 148363.61  43861.79 105835.2  25589.08

> head(colData)
       id tumor_normal random_gender   group
pat1  cp2            N        female Control
pat2  cp7            N        female Control
pat3 cp19            N          male Control
pat4 cp26            N          male Control
pat5 cp29            N        female Control
pat6 cp32            N          male Control

> head(rowData)
                                    BIOCHEMICAL    SUPER_PATHWAY      SUB_PATHWAY METABOLON_ID
met1  1-arachidonoylglycerophosphoethanolamine*            Lipid        Lysolipid        35186
met2      1-arachidonoylglycerophosphoinositol*            Lipid        Lysolipid        34214
met3                      1-arachidonylglycerol            Lipid Monoacylglycerol        34397
met4      1-eicosadienoylglycerophosphocholine*            Lipid        Lysolipid        33871
met5 1-heptadecanoylglycerophosphoethanolamine* No Super Pathway       No Pathway        37419
met6       1-linoleoylglycerol (1-monolinolein)            Lipid Monoacylglycerol        27447
      PLATFORM KEGG_ID   HMDB_ID
met1 LC/MS neg    <NA> HMDB11517
met2 LC/MS neg    <NA>      <NA>
met3 LC/MS neg  C13857 HMDB11572
met4 LC/MS pos    <NA>      <NA>
met5 LC/MS neg    <NA>      <NA>
met6 LC/MS neg    <NA>      <NA>
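
The previews above suggest how the three inputs line up: the row names of assay match the row names of rowData (met1, met2, ...), and the column names of assay match the row names of colData (pat1, pat2, ...). Below is a minimal base-R sketch of that layout for your own data. The toy_* objects and all values in them are made up for illustration, and the alignment check is my assumption about what the package expects, not something taken from its documentation.

```r
# Toy stand-ins for the three inputs (all values are invented).
toy_assay <- matrix(
  c(10, 20, NA,
    12, 18, 7),
  nrow = 3,
  dimnames = list(c("met1", "met2", "met3"), c("pat1", "pat2"))
)

toy_rowData <- data.frame(
  BIOCHEMICAL = c("metabolite A", "metabolite B", "metabolite C"),
  KEGG_ID     = NA_character_,
  row.names   = c("met1", "met2", "met3"),
  stringsAsFactors = FALSE
)

toy_colData <- data.frame(
  id           = c("sample1", "sample2"),
  tumor_normal = c("N", "T"),
  row.names    = c("pat1", "pat2"),
  stringsAsFactors = FALSE
)

# Assumed requirement: metabolites and samples are in the same order
# in all three objects.
inputs_aligned <-
  identical(rownames(toy_assay), rownames(toy_rowData)) &&
  identical(colnames(toy_assay), rownames(toy_colData))
print(inputs_aligned)
```

If the check prints FALSE for your own data, reordering rowData and colData to match the assay matrix is probably the first thing to try.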

#Merge the three datasets into a single object for downstream analysis.
> (met <- create_mae(assay,rowData,colData))
A MultiAssayExperiment object of 1 listed
 experiment with a user-defined name and respective class. 
 Containing an ExperimentList class object of length 1: 
 [1] raw: SummarizedExperiment with 307 rows and 86 columns 
Features: 
 experiments() - obtain the ExperimentList instance 
 colData() - the primary/phenotype DataFrame 
 sampleMap() - the sample availability DataFrame 
 `$`, `[`, `[[` - extract colData columns, subset, or experiment 
 *Format() - convert into a long or wide DataFrame 
 assays() - convert ExperimentList to a SimpleList of matrices

2.2 Metabolite annotation

If HMDB, KEGG, or ChEBI IDs are part of rowData, metabolite annotations can be retrieved from the Small Molecule Pathway Database (SMPDB).

> met <- get_SMPDBanno(met,
+                           column_kegg_id=6,
+                           column_hmdb_id=7,
+                           column_chebi_id=NA)

2.3 Handling missing values
> na_heatmap(met,
+            group_factor="tumor_normal",
+            label_colors=c("darkseagreen","dodgerblue"))

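#Drop metabolites with too many missing values (cutoff=0.4, i.e. more than 40% missing) and impute the remaining gaps by k-nearest-neighbor imputation.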
> (met = knn_impute(met,cutoff=0.4))
A MultiAssayExperiment object of 2 listed
 experiments with user-defined names and respective classes. 
 Containing an ExperimentList class object of length 2: 
 [1] raw: SummarizedExperiment with 307 rows and 86 columns 
 [2] imputed: SummarizedExperiment with 238 rows and 86 columns 
Features: 
 experiments() - obtain the ExperimentList instance 
 colData() - the primary/phenotype DataFrame 
 sampleMap() - the sample availability DataFrame 
 `$`, `[`, `[[` - extract colData columns, subset, or experiment 
 *Format() - convert into a long or wide DataFrame 
 assays() - convert ExperimentList to a SimpleList of matrices
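
The drop from 307 to 238 rows in the imputed experiment comes from the missingness filter. Here is a minimal base-R sketch of that filtering step only, on a toy matrix with invented values. Whether the package keeps or drops a metabolite sitting exactly at the cutoff is an assumption here, and the k-nearest-neighbor imputation itself is not reproduced.

```r
# Toy abundance matrix: rows are metabolites, columns are samples.
toy <- rbind(
  met1 = c(1, 2, 3, 4, 5),     # 0% missing
  met2 = c(NA, 2, 3, 4, 5),    # 20% missing
  met3 = c(NA, NA, NA, 4, 5)   # 60% missing
)

# Fraction of missing values per metabolite.
na_fraction <- rowMeans(is.na(toy))

# Keep metabolites whose missing fraction does not exceed the cutoff
# (the boundary handling "<=" is an assumption).
cutoff <- 0.4
kept <- toy[na_fraction <= cutoff, , drop = FALSE]

print(na_fraction)
print(rownames(kept))
```

Only the metabolites that survive this filter would then be passed on to imputation.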

2.4 Outlier heatmap

Before normalizing the data, we need to remove outlier samples.

> outlier_heatmap(met,
+                 group_factor="tumor_normal",
+                 label_colors=c("darkseagreen","dodgerblue"),
+                 k=2)

With k=2, the heatmap above splits the samples into cluster 1 and cluster 2. Cluster 1 is the outlier group relative to cluster 2, so we remove cluster 1.

> (met <- remove_cluster(met,cluster=1))
harmonizing input:
  removing 5 sampleMap rows with 'colname' not in colnames of experiments
harmonizing input:
  removing 5 sampleMap rows with 'colname' not in colnames of experiments
  removing 5 colData rownames not in sampleMap 'primary'
A MultiAssayExperiment object of 2 listed
 experiments with user-defined names and respective classes. 
 Containing an ExperimentList class object of length 2: 
 [1] raw: SummarizedExperiment with 307 rows and 81 columns 
 [2] imputed: SummarizedExperiment with 238 rows and 81 columns 
Features: 
 experiments() - obtain the ExperimentList instance 
 colData() - the primary/phenotype DataFrame 
 sampleMap() - the sample availability DataFrame 
 `$`, `[`, `[[` - extract colData columns, subset, or experiment 
 *Format() - convert into a long or wide DataFrame 
 assays() - convert ExperimentList to a SimpleList of matrices
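
The idea of "cluster the samples into k groups, then drop the outlying group" can be sketched in base R as below. This is not how outlier_heatmap or remove_cluster are implemented internally (I have not checked); the toy values, the Euclidean distance, and the rule "the smaller cluster is the outlier" are all assumptions made for the example. With real data, which cluster number is the outlier has to be read off the heatmap.

```r
# Toy matrix: rows are metabolites, columns are samples (invented values).
toy <- cbind(
  s1 = c(10.0, 11.0, 12.0, 13.0),
  s2 = c(10.5, 11.2, 11.9, 13.1),
  s3 = c( 9.8, 10.9, 12.2, 12.8),
  s4 = c(30.0, 31.0, 29.0, 33.0),  # clearly shifted sample
  s5 = c(10.1, 11.1, 12.1, 13.2)
)

# Hierarchical clustering of samples, cut into k = 2 groups.
tree <- hclust(dist(t(toy)))
cluster <- cutree(tree, k = 2)

# For this toy case, treat the smaller cluster as the outlier group.
sizes <- table(cluster)
outlier_cluster <- as.integer(names(sizes)[which.min(sizes)])

# Drop the samples that belong to the outlier cluster.
cleaned <- toy[, cluster != outlier_cluster, drop = FALSE]
print(colnames(cleaned))
```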

2.5 Data normalization
> (met <- normalize_met(met))
vsn2: 307 x 81 matrix (1 stratum). 
Please use 'meanSdPlot' to verify the fit.
vsn2: 238 x 81 matrix (1 stratum). 
Please use 'meanSdPlot' to verify the fit.
A MultiAssayExperiment object of 4 listed
 experiments with user-defined names and respective classes. 
 Containing an ExperimentList class object of length 4: 
 [1] raw: SummarizedExperiment with 307 rows and 81 columns 
 [2] imputed: SummarizedExperiment with 238 rows and 81 columns 
 [3] norm: SummarizedExperiment with 307 rows and 81 columns 
 [4] norm_imputed: SummarizedExperiment with 238 rows and 81 columns 
Features: 
 experiments() - obtain the ExperimentList instance 
 colData() - the primary/phenotype DataFrame 
 sampleMap() - the sample availability DataFrame 
 `$`, `[`, `[[` - extract colData columns, subset, or experiment 
 *Format() - convert into a long or wide DataFrame 
 assays() - convert ExperimentList to a SimpleList of matrices
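
The vsn2 lines in the log show that normalization relies on variance stabilization. To see why that matters, here is a small base-R simulation: with multiplicative noise, the standard deviation of a feature grows with its mean on the raw scale, while an arsinh-type transformation flattens that trend. The plain asinh() call is a simplified stand-in for what vsn fits (vsn also estimates calibration parameters from the data, which this sketch skips), and the simulated values are invented.

```r
set.seed(1)

# 50 features with increasing true abundance, 20 samples each.
mu <- seq(100, 10000, length.out = 50)
noise <- matrix(rnorm(50 * 20, sd = 0.2), nrow = 50)
raw <- mu * exp(noise)            # multiplicative noise; row i has mean level mu[i]

# Simplified variance-stabilizing transformation (stand-in for vsn's fit).
transformed <- asinh(raw)

# Correlation between per-feature mean and per-feature standard deviation.
sd_vs_mean <- function(m) cor(rowMeans(m), apply(m, 1, sd))

cor_raw  <- sd_vs_mean(raw)
cor_glog <- sd_vs_mean(transformed)
print(c(raw = cor_raw, transformed = cor_glog))
```

A strong positive value for the raw data and a much weaker one after transformation is the pattern that a mean-versus-SD plot is meant to confirm.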

2.6 Quality control of the normalization
> quality_plot(met,
+              group_factor="tumor_normal",
+              label_colors=c("darkseagreen","dodgerblue"))
harmonizing input:
  removing 243 sampleMap rows not in names(experiments)
harmonizing input:
  removing 243 sampleMap rows not in names(experiments)
harmonizing input:
  removing 243 sampleMap rows not in names(experiments)
harmonizing input:
  removing 243 sampleMap rows not in names(experiments)
Warning messages:
1: Removed 5356 rows containing non-finite values (stat_boxplot). 
2: Removed 5356 rows containing non-finite values (stat_boxplot). 

3. Data analysis
3.1 Unsupervised analysis

MetaboDiff provides PCA, a linear dimensionality-reduction method, and tSNE, a nonlinear one.

> source("http://peterhaschke.com/Code/multiplot.R")
> multiplot(
+   pca_plot(met,
+            group_factor="tumor_normal",
+            label_colors=c("darkseagreen","dodgerblue")),
+   tsne_plot(met,
+             group_factor="tumor_normal",
+             label_colors=c("darkseagreen","dodgerblue")),
+   cols=2)
sigma summary: Min. : 0.486945518988849 |1st Qu. : 0.714292832194587 |Median : 0.752934663223126 |Mean : 0.75914557339073 |3rd Qu. : 0.808081774279559 |Max. : 0.939549187337462 |
Epoch: Iteration #100 error is: 18.6145995899728
Epoch: Iteration #200 error is: 1.54407709770312
Epoch: Iteration #300 error is: 1.22290267643501
Epoch: Iteration #400 error is: 1.11106327484334
Epoch: Iteration #500 error is: 1.03658104678225
Epoch: Iteration #600 error is: 0.976566767973725
Epoch: Iteration #700 error is: 0.951849496540308
Epoch: Iteration #800 error is: 0.93612964053674
Epoch: Iteration #900 error is: 0.914421902208305
Epoch: Iteration #1000 error is: 0.88283039690459

3.2 Hypothesis testing

Differential analysis of individual metabolites is done mainly with the t-test and ANOVA.

> met = diff_test(met,
+                 group_factors = c("tumor_normal","random_gender"))
> str(metadata(met), max.level=2)
List of 2
 $ ttest_tumor_normal_T_vs_N         :'data.frame': 238 obs. of  3 variables:
  ..$ pval       : num [1:238] 0.0206 0.7808 0.0832 0.0432 0.5859 ...
  ..$ adj_pval   : num [1:238] 0.102 0.904 0.221 0.158 0.758 ...
  ..$ fold_change: num [1:238] 0.2872 0.0366 -0.3936 -0.5391 -0.1646 ...
 $ ttest_random_gender_male_vs_female:'data.frame': 238 obs. of  3 variables:
  ..$ pval       : num [1:238] 0.2318 0.8626 0.4048 0.0121 0.2111 ...
  ..$ adj_pval   : num [1:238] 0.83 0.959 0.862 0.386 0.83 ...
  ..$ fold_change: num [1:238] -0.1372 -0.0208 0.1742 0.607 0.3438 ...
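
The three columns in each result table (pval, adj_pval, fold_change) can be reproduced in spirit with base R. The sketch below is a guess at the general recipe, not the package's code: the Welch t-test, the Benjamini-Hochberg adjustment, and the definition of fold change as a difference of group means on the normalized scale (T minus N) are all assumptions, and the data are simulated.

```r
set.seed(7)

# Simulated normalized data: 20 metabolites, 5 normal and 5 tumor samples.
n_met <- 20
x <- matrix(rnorm(n_met * 10), nrow = n_met,
            dimnames = list(paste0("met", 1:n_met), NULL))
group <- factor(rep(c("N", "T"), each = 5), levels = c("N", "T"))

# Make the first three metabolites truly different in tumor samples.
x[1:3, group == "T"] <- x[1:3, group == "T"] + 2

# One t-test per metabolite (Welch test, the t.test() default).
res <- t(apply(x, 1, function(v) {
  tt <- t.test(v[group == "T"], v[group == "N"])
  c(pval = tt$p.value,
    fold_change = mean(v[group == "T"]) - mean(v[group == "N"]))
}))
res <- data.frame(res)

# Multiple-testing correction (Benjamini-Hochberg is an assumption).
res$adj_pval <- p.adjust(res$pval, method = "BH")

print(head(res[order(res$pval), ], 5))
```

Plotting fold_change against -log10(pval) or -log10(adj_pval) gives the two flavors of volcano plot, which is what the p_adjust switch toggles.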
#Differential analysis by tumor versus normal
> volcano_plot(met, 
+              group_factor="tumor_normal",
+              label_colors=c("darkseagreen","dodgerblue"),
+              p_adjust = FALSE)
> volcano_plot(met, 
+              group_factor="tumor_normal",
+              label_colors=c("darkseagreen","dodgerblue"),
+              p_adjust = TRUE)


#Differential analysis by female versus male
> par(mfrow=c(1,2))
> volcano_plot(met, 
+              group_factor="random_gender",
+              label_colors=c("brown","orange"),
+              p_adjust = FALSE)
> volcano_plot(met, 
+              group_factor="random_gender",
+              label_colors=c("brown","orange"),
+              p_adjust = TRUE)

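3.3 Metabolite correlation network analysis

Correlation analysis has been applied successfully in comparative transcriptomics to reveal changes in biologically meaningful modules. The same idea can be applied to metabolomics data.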

> met_example <- met_example %>%
+   diss_matrix %>%    #build the dissimilarity matrix
+   identify_modules(min_module_size=5) %>%  #identify metabolic correlation modules
+   name_modules(pathway_annotation="SUB_PATHWAY") %>%  #name the modules
+   calculate_MS(group_factors=c("tumor_normal","random_gender")) #compute the significance of the association between modules and sample traits

alpha: 1.000000
 ..cutHeight not given, setting it to 0.991  ===>  99% of the (truncated) height range in dendro.
 ..done.
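
The pipeline above goes from a dissimilarity matrix to modules. A stripped-down base-R version of that idea is sketched here on a tiny deterministic toy matrix. The package builds on WGCNA, and the log line about cutHeight suggests a dynamic tree cut, so the unsigned-correlation dissimilarity, average linkage, and fixed k = 2 used below are simplifying assumptions, not a reproduction of diss_matrix or identify_modules.

```r
# Two underlying sample profiles across 12 samples.
n  <- seq_len(12)
s1 <- sin(n)
s2 <- cos(1.7 * n)

# Small deterministic perturbation so metabolites are not exact copies.
wiggle <- function(i) 0.05 * sin(3.1 * i * n)

# Six toy metabolites: m1-m3 follow s1, m4-m6 follow s2.
x <- rbind(
  m1 =  s1       + wiggle(1),
  m2 =  2 * s1   + wiggle(2),
  m3 = -s1       + wiggle(3),
  m4 =  s2       + wiggle(4),
  m5 =  3 * s2   + wiggle(5),
  m6 =  0.5 * s2 + wiggle(6)
)

# Dissimilarity from unsigned correlation between metabolites.
diss <- 1 - abs(cor(t(x)))

# Hierarchical clustering and a fixed cut into two modules.
tree <- hclust(as.dist(diss), method = "average")
modules <- cutree(tree, k = 2)
print(modules)
```

Because the dissimilarity is unsigned, the anti-correlated metabolite m3 lands in the same module as m1 and m2.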
#Visualize the modules: hierarchical clustering
> WGCNA::plotDendroAndColors(metadata(met_example)$tree, 
+                            metadata(met_example)$module_color_vector, 
+                            'Module colors', 
+                            dendroLabels = FALSE, 
+                            hang = 0.03,
+                            addGuide = TRUE, 
+                            guideHang = 0.05, main='')

#Visualize the modules: relationships between modules
> par(mar=c(2,2,2,2))
> ape::plot.phylo(ape::as.phylo(metadata(met_example)$METree),
+                 type = 'fan',
+                 show.tip.label = FALSE, 
+                 main='')
> ape::tiplabels(frame = 'circle',
+                col='black', 
+                text=rep('',length(unique(metadata(met_example)$modules))), 
+                bg = WGCNA::labels2colors(0:21))

#Visualize the named modules
> ape::plot.phylo(ape::as.phylo(metadata(met_example)$METree), cex=0.9)

#Visualize the significance of the module associations for tumor versus normal samples
> MS_plot(met_example,
+         group_factor="tumor_normal",
+         p_value_cutoff=0.05,
+         p_adjust=FALSE)
#Visualize the significance of the module associations for samples of different gender
> MS_plot(met_example,
+         group_factor="random_gender",
+         p_value_cutoff=0.05,
+         p_adjust=FALSE)

#Test the individual metabolites of a module of interest (MOI) for differences between sample groups
> MOI_plot(met_example,
+          group_factor="tumor_normal",
+          MOI = 2,
+          label_colors=c("darkseagreen","dodgerblue"),
+          p_adjust = FALSE) + xlim(c(-1,8))
