NC论文图复现-优雅绘制多分组热图-开源代码和数据

米妮

文章信息如图,今天学习复现的图表文章为最近发表的论文~

复刻图来源的文章

文章概要:在蓝藻聚集体内,蓝藻与共生细菌之间的物种间氢转移被发现是氮损失的新途径。蓝藻能够在昼夜周期下生成H2,作为邻近氢营养脱氮菌的电子供体。工程化蓝藻聚集体的氢营养脱氮速率为3.47 ± 0.42 mmol l–1 day–1,几乎占到异养脱氮速率的50%。在63个全球分布的蓝藻聚集体中,84%存在H2生成蓝藻和氢营养脱氮菌。研究表明,蓝藻繁盛期间的物种间氢转移可能导致氮损失率过高(对文章感兴趣自己看哦,Interspecies hydrogen transfer between cyanobacteria and symbiotic bacteria drives nitrogen loss)。

学习复现的图为 Fig. 6e

论文原图
R 代码复刻图(其他美化细节需要使用画图软件 AI 喽)

这张图展示了一个热图(heatmap),用于比较不同生态系统(如工程系统、湖泊、海洋、温泉和冰川)中微生物代谢过程的基因丰度。热图中的每个方块代表一个特定基因在特定生态系统中的相对丰度,颜色的深浅表示丰度的高低,颜色越深表示丰度越高。

此图其他适用场景:不同分组病人(样本)在感兴趣的几个通路中的一些重要基因的表达比较(把分组信息定义成自己感兴趣想比较的组就可以)。

生成模拟数据

#### 生成模拟数据 尽量 复现 原图 ####
library(writexl)
set.seed(123)

# 样本分组与命名
samples <- c(
  paste0("W", 1:18),   # Engineered system
  paste0("L", 1:19),   # Lake
  paste0("M", 1:10),   # Marine
  paste0("H", 1:12),   # Hot spring
  paste0("G", 1:10)    # Glacier
)
group2 <- c(
  rep("Engineered system", 18),
  rep("Lake", 19),
  rep("Marine", 10),
  rep("Hot spring", 12),
  rep("Glacier", 10)
)

# 基因名(可自定义或用下方示例)
genes <- c(
  "Aerobic respiration-CcoN", "Aerobic respiration-CoxA", "Aerobic respiration-CydA", "Aerobic respiration-CyoA",
  "Nitrate reduction-NapA", "Nitrate reduction-NarG", "DNRA-NrfH", "Nitrite reduction-NirK",
  "Nitric oxide reduction-NorB", "Dinitrogen fixation-NifH", "H2-uptake hydrogenase", "H2-evolving hydrogenase",
  "H2-bidirectional hydrogenase", "Arsenate reduction-ArsC", "Sulfate reduction-Sat", "Sulfate reduction-AprA",
  "Ferric iron reduction-MtrA", "Methane production-McrA", "Ferrous iron oxidation-Cyc2", "Thiosulfate oxidation-SoxB",
  "Sulfide oxidation-Sqr", "Sulfide oxidation-DsrA", "Formate oxidation-FdhA", "PSI reaction centre-PsaA",
  "PSII reaction centre-PsbA", "Light proton pumping-RHO"
)

n_gene <- length(genes)
n_sample <- length(samples)

# group2(功能分组)可随机分配或自定义
gene_group2 <- rep(c("Oxygen metabolism", "Nitrogen metabolism", "Hydrogen metabolism", "Other electron acceptors", "Other electron donors", "Phototrophy"), length.out = n_gene)

# 构建数据框
df <- data.frame(
  Gene = genes,
  group2 = gene_group2
)

# 添加样本数据(0~1之间的随机数)
for (s in samples) {
  df[[s]] <- runif(n_gene)
}

# 写入Excel
write_xlsx(list("Fig.6e" = df), "multiGroup_heatmap_data.xlsx")

# 模拟 group.tsv
group_df <- data.frame(
  Sample = samples,
  group = group2
)
readr::write_tsv(group_df, "multiGroup_heatmap_group.tsv")

(自己有数据的话根据生成的模拟数据格式,把自己的数据替代模拟数据就可以了)

读入模拟数据

library(tidyverse)
library(readxl)
library(ggplot2)
library(ggh4x)
library(ggtext)

df <- read_excel("multiGroup_heatmap_data.xlsx",sheet ="Fig.6e") %>% 
  select(1:59) %>% 
  pivot_longer(-c(Gene,group2)) %>% 
  mutate(name=str_replace_all(name,"WW","W")) %>% 
  left_join(.,read_tsv("multiGroup_heatmap_group.tsv") %>% 
              mutate(Sample=str_replace_all(Sample,"WW","W")),
            by=c("name"="Sample"))

# 定义因子
df$name <- factor(df$name,levels = unique(df$name))
df$Gene <- factor(df$Gene,levels = rev(unique(df$Gene)))
df$group <- factor(df$group,levels = unique(df$group))
df$group2 <- factor(df$group2,levels = unique(df$group2))

# 分面条带颜色
strips <- strip_themed(
  background_y =elem_list_rect(fill=c("#EC7A05","#9AC2CD99","#82B7F590","#5A77D190",
                          "#FAD77B","#16419480")),
  background_x=elem_list_rect(fill=c("#5A77D1","#44007A","#9AC2CD","#6B9B3C",
                                     "#82B7F5")))

ggplot画图

ggplot(df,aes(name,Gene,fill=value))+
  geom_tile(color="grey80")+
  facet_grid2(group2~group,scale="free",
              switch="y",strip=strips) +
  force_panelsizes(cols=c(13,13,9,12,10),
                   rows = c(4,6,3,5,5,3),respect=F)+
  scale_x_discrete(expand = c(0,0),
                   position = "top")+
  scale_y_discrete(expand = c(0,0))+
  scale_fill_gradientn(
    colors = c("white", "#e0f3f3", "#66b2a3", "#004c4c", "black"),
    values = scales::rescale(c(0, 0.01, 0.6, 0.95, 1.01)),
    limits = c(0, 1),
    oob = scales::squish,
    breaks = c(0, 0.3, 0.6, 0.9, 1),
    labels = c("0.0", "0.3", "0.6", "0.9", ">1.0")
  )+
  labs(x=NULL,y=NULL) +
  coord_cartesian(clip="off") +
  theme_test() +
  theme(axis.text.x=
          element_text(color="black",angle = 45,
                       vjust=0.5,hjust =0,size=7),
        axis.text.y=element_text(color="black",size=7),
        strip.placement = "outside",
        axis.ticks = element_blank(),
        panel.spacing.x = unit(0,"cm"),
        panel.spacing.y = unit(0,"cm"),
        strip.text.x = 
          element_text(color="white",size=10,face="bold"),
        strip.text.y = element_markdown(color="black",size=7),
        strip.background = element_rect(color="white"),
        legend.background = element_blank(),
        legend.title = element_blank(),
        legend.key.height = unit(1,"null"))

以上代码运行完以后就是得到复刻的图了:

复刻图

其他注意事项

Error in validate_element_list(background_x, "element_rect") : 
  object 'is_theme_element' not found

如果报错,可能是 ggh4x包版本较老,或者和ggplot2版本不兼容。

解决方法

升级 ggh4x 和 ggplot2

请在R中运行以下命令,升级到最新版:

install.packages("ggplot2")
install.packages("ggh4x")

当前运行成功的参考代码环境:

> sessionInfo()
R version 4.3.3 (2024-02-29)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS 15.4.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Asia/Shanghai
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils    
[5] datasets  methods   base     

(比较重要的是这一块环境)
other attached packages:
 [1] writexl_1.5.4   ggtext_0.1.2   
 [3] ggh4x_0.3.1     readxl_1.4.3   
 [5] lubridate_1.9.3 forcats_1.0.0  
 [7] stringr_1.5.1   dplyr_1.1.4    
 [9] purrr_1.0.2     readr_2.1.5    
[11] tidyr_1.3.1     tibble_3.2.1   
[13] ggplot2_3.5.2   tidyverse_2.0.0

loaded via a namespace (and not attached):
 [1] utf8_1.2.4        generics_0.1.3   
 [3] xml2_1.3.6        stringi_1.8.4    
 [5] hms_1.1.3         magrittr_2.0.3   
 [7] grid_4.3.3        timechange_0.3.0 
 [9] cellranger_1.1.0  fansi_1.0.6      
[11] scales_1.3.0      cli_3.6.3        
[13] rlang_1.1.4       crayon_1.5.3     
[15] bit64_4.5.2       munsell_0.5.1    
[17] commonmark_1.9.2  withr_3.0.2      
[19] tools_4.3.3       tzdb_0.4.0       
[21] colorspace_2.1-1  vctrs_0.6.5      
[23] R6_2.5.1          lifecycle_1.0.4  
[25] bit_4.5.0         vroom_1.6.5      
[27] pkgconfig_2.0.3   pillar_1.9.0     
[29] gtable_0.3.6      glue_1.8.0       
[31] Rcpp_1.0.13-1     xfun_0.49        
[33] tidyselect_1.2.1  rstudioapi_0.17.1
[35] farver_2.1.2      compiler_4.3.3   
[37] markdown_1.13     gridtext_0.1.5 
©著作权归作者所有,转载或内容合作请联系作者
平台声明:文章内容(如有图片或视频亦包括在内)由作者上传并发布,文章内容仅代表作者本人观点,简书系信息发布平台,仅提供信息存储服务。

推荐阅读更多精彩内容