CRISPR-KO-KEGG-GO富集分析

CRISPR-KO-KEGG-GO富集分析

代码如下:
三个R文件:

  1. Func-CRIPSR (Func_CRISPR_20250519-Huh7-RAG.R)
  2. Func-KEGG-GO (Func_KEGG_GO_20250520.R)
  3. use (use_vol_CRISPR-20250519.R)

起始文件为Mageck分析好的*.gene_summary.txt文件
如下图:


*.gene_summary.txt

内容示例如下:


Ctrl_Mock.gene_summary.txt 部分信息

切换工作目录到该系列文件所在路径
打开use文件3,source文件1和2,然后循环读取*.gene_summary.txt文件,并进行绘图

具体思路

循环读取*.gene_summary.txt文件
------CRISPR函数
-------------取前缀名vsname
-------------设置阈值P,FC
-------------提取成.CSV文件(两个文件,全部基因或差异基因)
-------------EnhancedVolcano绘制火山图(阈值函数内设定,与上述独立)
-------------创建文件夹,保存RDS,保存volcano
-------------返回此次.csv文件名(差异基因,只含up down)
------KEGG_GO函数
-------------读取CSV文件,获取up基因(CRISPR阴性筛选,其中只up有意义)
-------------提取文件名前缀作为文件夹名(注:CRISPR函数已经创建)
-------------KEGG分析,多个阈值测试(0.01, 0.05, 0.1, 0.2)
-------------KEGG绘图(bar,dot,emap)只用10个通路,保存PDF
-------------GO富集分析,p值测试(0.01, 0.05, 0.1, 0.2),达到10个通路停止,如无,选用0.2
-------------绘图只用10个通路(BP,MF,CC三个dotplot),保存PDF


部分结果如下:

KEGG_dotplot

KEGG_barplot

KEGG_emap

GO-BP

GO-MF

GO-CC

全部具体代码如下
  1. CRISPR绘制火山图,并提取up down基因代码,在如下文件中
    Func_CRISPR_20250519-Huh7-RAG.R
library(tidyverse)
library(ggrepel)
set.seed(42)
require(EnhancedVolcano)


Vol_CRISPR <- function(file_path,  pC, FCC){
  # 指定CRISPR-MAGEECK分析的Counts 表格数据文件路径
  # file_path <- "E:/Coding/R_gzlab_docu/CrisprNGS/CRISPR-AQY/20250515/fenxi/Ctrl_Mock.gene_summary.txt"
  # file_path <- "E:/Coding/R_gzlab_docu/CrisprNGS/CRISPR-AQY/20250515/fenxi/R2_Mock.gene_summary.txt"
  # file_path <- "E:/Coding/R_gzlab_docu/CrisprNGS/CRISPR-AQY/20250515/fenxi/R5_Mock.gene_summary.txt"
  # file_path <- "E:/Coding/R_gzlab_docu/CrisprNGS/CRISPR-AQY/20250515/fenxi/R2_Ctrl.gene_summary.txt"
  # file_path <- "E:/Coding/R_gzlab_docu/CrisprNGS/CRISPR-AQY/20250515/fenxi/R5_Ctrl.gene_summary.txt"
  # file_path <- "E:/Coding/R_gzlab_docu/CrisprNGS/CRISPR-AQY/20250515/fenxi/R5_R2.gene_summary.txt"
  
  
  
  # 读取文件
  data <- read.table(file_path,header=T,na.strings = c("NA"))
  
  # 获取文件名
  get_first_part <- function(file_path) {
    # 获取文件名(含扩展名)
    file_name <- basename(file_path)
    # 使用 strsplit 按第一个点分割
    parts <- strsplit(file_name, "\\.", fixed = FALSE)[[1]]
    # 返回第一个部分
    return(parts[1])
  }
  vsname <- get_first_part(file_path)  # 输出: "file"
  vsname
  
  
  
  
  
  
  # 设置参数cutoff
  # FC_cutoff = 1.333
  # P_cutoff = 0.01
  FC_cutoff = 1
  P_cutoff = 0.01
  
  
  data[which(abs(data$pos.lfc) <= FC_cutoff | data$pos.score >= P_cutoff),'sig'] <- 'none'
  data[which(data$pos.lfc >= FC_cutoff & data$pos.score < P_cutoff),'sig'] <- 'up'
  data[which(data$pos.lfc <= -FC_cutoff & data$pos.score < P_cutoff),'sig'] <- 'down'
  
  #输出选择的差异基因总表
  data_select <- subset(data, sig %in% c('up', 'down'))
  dim(data_select)
  
  #保存为csv--差异基因排序
  TwoGroupFileName = paste0(vsname,"/",vsname,".csv")
  # data_selecto <- data_select[order(data_select$padj, data_select$log2FoldChange, decreasing = c(FALSE, TRUE)), ]#排序
  write.csv(data_select, file = TwoGroupFileName, row.names = TRUE)
  
  #保存为csv--全部基因排序
  TwoGroupFileName_all = paste0(vsname,"/",vsname,"_all.csv")
  write.csv(data, file = TwoGroupFileName_all, row.names = TRUE)
  
  

  
  data$id
  vsname
  title = paste0(vsname,' (CRISPR Results)')
  title
  subtitle = paste0("Differential expression")
  subtitle
  
  # pCutoff = 0.001,## pvalue
  # FCcutoff = 1.333,## FC cutoff
  ##火山图, 里面确定pCut和FCcut
  p_evo <-  EnhancedVolcano(data,
                            lab = data$id,
                            x = "pos.lfc",
                            # y = "padj",
                            y = "pos.score",
                            #selectLab = rownames(lrt)[1:4],
                            xlab = bquote(~Log[2]~ "Fold Change"),
                            ylab = bquote(~-Log[10] ~ italic(p-value)),
                            # pCutoff = 0.001,## pvalue
                            # FCcutoff = 1,## FC cutoff
                            pCutoff = pC,## pvalue
                            FCcutoff = FCC,## FC cutoff
                            # xlim = c(-5,5),
                            # ylim = c(0,7.5),
                            xlim = c(-6,6),
                            ylim = c(0,8),
                            pointSize = 3.5,
                            labSize =3.5,
                            title = title,
                            subtitle = subtitle,
                            caption = 'FC cutoff, 1.333; p-value cutoff, 10e-4',
                            # legendPosition = "right",
                            legendLabSize = 14,
                            # col = c('grey30', 'forestgreen', 'royalblue', 'red2'),
                            colAlpha = 0.8,
                            drawConnectors = TRUE,
                            lengthConnectors = unit(0.05, "inches"),
                            # hline = c(10e-8),
                            widthConnectors = 0.5,
                            max.overlaps = 50,
                            maxoverlapsConnectors=50
  )
  p_evo
  
  # 创建文件夹,两两分析
  dir.create(vsname)
  # 创建RDS文件名
  f_name = paste0(vsname,"/",vsname,".RDS")
  f_name
  
  
  ggsave(paste0(vsname,"/",vsname,"-volcano.png"),p_evo,height = 10,width = 8)
  
  
  
  
  
  
  
  
  
  
  
  
  # yuzhi = 0.01
  # # 定义颜色
  # labelcolor = ifelse(data$pos.score<yuzhi,"red1","black1")
  # # 定义标签
  # data$sign <- ifelse(data$pos.score < yuzhi ,data$id,NA)
  # data$sign
  #geom_text(aes(label = sign), size = 3)
  
  
  # # 加颜色加大小 加标签
  # ggplot(data=data, mapping=aes( x=pos.rank, y=-log(pos.score)))+
  #   geom_point(aes(col=labelcolor,size=-log(pos.score)))+
  #   geom_label_repel(aes(pos.rank,-log(pos.score), 
  #                        label=sign), max.overlaps=100,fontface="bold", color="red", 
  #                    box.padding=unit(0.35, "lines"), point.padding=unit(0.5, "lines"), 
  #                    segment.colour = "grey50")+ theme_classic(base_size = 16)+
  #   scale_color_manual(values=c("black", "red"))+
  #   # ggtitle("MEM2nd_Mock")
  #   # ggtitle("Ctrl_Mock")
  #   # ggtitle("R5_Ctrl")
  #   ggtitle("R5_Mock")
  # 
  # 
  # 
  # labelcolor = ifelse(data$pos.score<0.0001,"red1","black1")
  # # 加颜色家大小(原始冒泡)
  # ggplot(data=data, mapping=aes( x=id, y=-log(pos.score)))+
  #   geom_point(aes(col=labelcolor,size=-log(pos.score)))+
  #   scale_color_manual(values=c("grey", "red"))+
  #   # ggtitle("R5_Ctrl")+
  #   # ggtitle("Top10_Mock")+
  #   ggtitle("R5_Mock")+
  #   geom_label_repel(aes(id,-log(pos.score), 
  #                        label=sign), max.overlaps=100,fontface="bold", color="red", 
  #                    box.padding=unit(0.35, "lines"), point.padding=unit(0.5, "lines"), 
  #                    segment.colour = "grey50")+ theme_classic(base_size = 16)+
  #   theme(axis.text.x = element_blank())
  TwoGroupFileName
  return(TwoGroupFileName)
}

  1. KEGG_GO富集分析绘图代码,在如下文件中
    Func_KEGG_GO_20250520.R
# 加载包
library(clusterProfiler)
library(org.Hs.eg.db)
library(enrichplot)
library(ggplot2)
library(dplyr)

# TwoGroupFileName_all
# id num  neg.score neg.p.value  neg.fdr neg.rank neg.goodsgrna    neg.lfc pos.score pos.p.value  pos.fdr pos.rank pos.goodsgrna    pos.lfc  sig
# 1 AVPR1B   3 0.00017667  0.00062288 0.858213        1             2  -4.013100   0.71203     0.72674 0.993890     1575             1  -4.013100 none
# 2 ABCC10   5 0.00034953  0.00157350 0.858213        2             4  -9.956200   0.86124     0.88344 0.994943     1720             1  -9.956200 none
# 3   CTSK   5 0.00047611  0.00208880 0.858213        3             4  -9.761500   0.28164     0.48870 0.991282      928             1  -9.761500 none
# 4   TSHR   3 0.00052994  0.00170350 0.858213        4             1  -0.039356   0.32336     0.46631 0.990656     1000             1  -0.039356 none
# 5  ANXA5   4 0.00071592  0.00268410 0.858213        5             3 -10.911000   0.94272     0.94320 0.994943     1872             0 -10.911000 none
# 6   CCR1   4 0.00081369  0.00305430 0.858213        6             3 -10.909000   0.25278     0.43110 0.982588      871             1 -10.909000 none


KEGG_GO <- function(file_path){
  # file_path 为全长 FUnc_CRISPR分析来的CSV 所有Up Down基因
  # 处理文件名,后续分析保存文件用
  # 获取文件名,从gene_summary,获取文件对比前缀,后续分析采用仅有up down的CSV
  
  #test
  # file_path <- "R2_Ctrl/R2_Ctrl.csv"
  file_path <- filepath_csv_updown
  
  get_first_part <- function(file_path) {
    # 获取文件名(含扩展名)
    file_name <- basename(file_path)
    # 使用 strsplit 按第一个点分割
    parts <- strsplit(file_name, "\\.", fixed = FALSE)[[1]]
    # 返回第一个部分
    return(parts[1])
  }
  vsname <- get_first_part(file_path)  # 输出: "file"
  vsname
  TwoGroupFileName = paste0(vsname,"/",vsname,".csv")
  TwoGroupFileName
  get_first_part <- function(x) {
    parts <- strsplit(x, "/", fixed = TRUE)[[1]]  # 分割字符串
    return(parts[1])  # 返回第一部分
  }
  go_name <- get_first_part(TwoGroupFileName)
  go_name
  #[1] "R2_Ctrl"
 
  # 读取数据
  gene_list <- read.csv(TwoGroupFileName, header = TRUE, stringsAsFactors = FALSE) %>% dplyr::select(id, sig)
  colnames(gene_list) <- c("gene_symbol", "regulation")
  
  # 基因ID转换
  gene_entrez <- bitr(gene_list$gene_symbol, fromType = "SYMBOL", toType = "ENTREZID", OrgDb = org.Hs.eg.db)
  gene_list <- merge(gene_list, gene_entrez, by.x = "gene_symbol", by.y = "SYMBOL")
  gene_list
  # >   gene_list
  # gene_symbol regulation ENTREZID
  # 1           A2M       none        2
  # 2         AADAC       none       13
  # 3          AATK       none     9625
  # 4          ABAT       none       18
  
  
  up_genes <- gene_list
  # 分离上下调基因
  up_genes <- gene_list$ENTREZID[gene_list$regulation == "up"]
  # down_genes <- gene_list$ENTREZID[gene_list$regulation == "down"]
  # up_genes <- gene_list$ENTREZID
  

  
  
  # KEGG富集分析------------------------------------------------------------------------------------------------------------------------
  # 测试多个阈值
  thresholds <- c(0.01, 0.05, 0.1, 0.2)
  for (thr in thresholds) {
    # KEGG富集分析
    kegg_up <- enrichKEGG(gene = up_genes, organism = "hsa", pAdjustMethod = "BH", qvalueCutoff = thr, pvalueCutoff = thr)
    print(paste("阈值 =", thr, "| 富集通路数 =", nrow(kegg_up)))
    if (nrow(kegg_up) >= 10){
      
      # KEGG条形图---------------------------------------------------------------------
      pKEGG_barplot <- barplot(kegg_up, 
                               showCategory = 10,  # 显示前10个通路
                               font.size = 12,     # 字体大小
                               title = "KEGG Pathway Enrichment") 
      pKEGG_barplot
      
      # 保存KEGG条形图为PDF
      timestamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
      # timestamp
      # [1] "20250527_103734"
      ggsave(paste0(go_name,"/",go_name,"-KEGG_up_barplot-", timestamp,".pdf"), 
             plot = pKEGG_barplot, 
             width = 6, 
             height = 6, 
             dpi = 300)
      
      # KEGG富集网络图-----------------------------------------------------------------
      # 计算通路相似度矩阵(关键步骤)
      kegg_up_sim <- pairwise_termsim(kegg_up)
      # 可视化富集网络
      pKEGG_emap <- emapplot(kegg_up_sim, showCategory = 20) +
        ggtitle("KEGG Pathway Enrichment Map (Upregulated Genes)") +
        theme(plot.title = element_text(hjust = 0.5))  # 标题居中
      pKEGG_emap
      # 保存KEGG富集网络图为PDF
      ggsave(paste0(go_name,"/",go_name,"-KEGG_up_emap-", timestamp,".pdf"), 
             plot = pKEGG_emap, 
             width = 12,
             height = 8,
             dpi = 300)
      
      # KEGG气泡图---------------------------------------------------------------------
      pKEGG_dotplot <- dotplot(kegg_up, showCategory = 10) +
        ggtitle("KEGG Pathway Enrichment (Upregulated Genes)") +
        theme(plot.title = element_text(hjust = 0.5))
      pKEGG_dotplot
      # 保存KEGG气泡图为PDF
      ggsave(paste0(go_name,"/",go_name,"-KEGG_up_dotplot-", timestamp,".pdf"), 
             plot = pKEGG_dotplot, 
             width = 6, 
             height = 6, 
             dpi = 300)
      # 满足10条跳出循环
      break
    }
  }

  
  # GO富集分析------------------------------------------------------------------------------------------------------------------------
  # 可视化----------------------------------------------------------------------------------------------------------------------------
  # barplot(kegg_up, showCategory = 10)
  # dotplot(go_up_BP, showCategory = 10)
  # 总共6张图
  # 1.KEGGbarplot
  # 2.KEGGemap
  # 3.KEGGdotplot
  # 4.GO_BP_dotplot
  # 4.GO_MF_dotplot
  # 4.GO_CC_dotplot
  # 绘制并保存GO气泡图外观 BP,MF,CC
  
  
  # GO_BP------------------------------------------------------------------------
  # 定义要测试的p值阈值
  thresholds <- c(0.01, 0.05, 0.1, 0.2)
  selected_threshold <- NULL  # 存储最终选择的阈值
  # 循环遍历各个阈值
  for (thr in thresholds) {
    # 执行GO富集分析 (BP类别)
    go_up_BP <- enrichGO(
      gene = up_genes,                 # 你的差异表达基因列表
      OrgDb = org.Hs.eg.db,            # 物种注释数据库
      ont = "BP",                      # 只分析生物过程(BP)
      pAdjustMethod = "BH",            # p值校正方法 
      pvalueCutoff = thr,              # 当前测试的p值阈值
      qvalueCutoff = thr               # q值阈值
    )
    # 检查富集结果是否有至少10个通路
    if (!is.null(go_up_BP) && nrow(go_up_BP) >= 10) {
      selected_threshold <- thr
      print(paste("使用阈值", thr, "找到", nrow(go_up_BP), "个富集通路"))
      # GO_BP 绘图
      p1 <- dotplot(go_up_BP, showCategory =  min(10, nrow(go_up_BP))) +
        labs(title = "GO Biological Process Enrichment",
             x = "Gene Ratio",
        ) +
        theme(axis.text.y = element_text(size = 10, face = "bold"),
              legend.title = element_text(size = 12),
              legend.text = element_text(size = 10))
      # p1
      # 保存GO_MF气泡图为PDF
      ggsave(paste0(go_name,"/",go_name,"-GO_BP_dotplot-", timestamp,".pdf"), 
             plot = p1, 
             width = 6, 
             height = 6, 
             dpi = 300)
      break  # 满足条件时跳出循环
    } else {
      print(paste("阈值", thr, "找到", ifelse(is.null(go_up_BP), 0, nrow(go_up_BP)), "个富集通路,继续尝试下一个阈值"))
    }
  }
  
  # 如果所有阈值都不满足条件,则使用最后一个阈值
  if (is.null(selected_threshold)) {
    selected_threshold <- tail(thresholds, 1)
    print(paste("所有阈值均未找到足够的富集通路,使用最大阈值", selected_threshold))
    
    # 再次执行富集分析,确保使用最后一个阈值
    go_up_BP <- enrichGO(
      gene = up_genes,
      OrgDb = org.Hs.eg.db,
      ont = "BP",
      pAdjustMethod = "BH",
      pvalueCutoff = selected_threshold,
      qvalueCutoff = selected_threshold
    )
    # GO_BP 绘图
    p1 <- dotplot(go_up_BP, showCategory = min(10, nrow(go_up_BP))) +
      labs(title = "GO Biological Process Enrichment",
           x = "Gene Ratio",
      ) +
      theme(axis.text.y = element_text(size = 10, face = "bold"),
            legend.title = element_text(size = 12),
            legend.text = element_text(size = 10))
    # p1
    # 保存GO_MF气泡图为PDF
    ggsave(paste0(go_name,"/",go_name,"-GO_BP_dotplot-", timestamp,".pdf"), 
           plot = p1, 
           width = 6, 
           height = 6, 
           dpi = 300)
    break  # 满足条件时跳出循环
  }
  
  
  
  
  
  
  
  

  # GO_MF------------------------------------------------------------------------
  # 定义要测试的p值阈值
  {
  thresholds <- c(0.01, 0.05, 0.1, 0.2)
  selected_threshold <- NULL  # 存储最终选择的阈值
  # 循环遍历各个阈值
  for (thr in thresholds) {
    # 执行GO富集分析 (BP类别)
    go_up_BP <- enrichGO(
      gene = up_genes,                 # 你的差异表达基因列表
      OrgDb = org.Hs.eg.db,            # 物种注释数据库
      ont = "MF",                      # 只分析生物过程(BP)
      pAdjustMethod = "BH",            # p值校正方法
      pvalueCutoff = thr,              # 当前测试的p值阈值
      qvalueCutoff = thr               # q值阈值
    )
    # 检查富集结果是否有至少10个通路
    if (!is.null(go_up_MF) && nrow(go_up_MF) >= 10) {
      selected_threshold <- thr
      print(paste("使用阈值", thr, "找到", nrow(go_up_BP), "个富集通路"))
      # GO_BP 绘图
      p1 <- dotplot(go_up_MF, showCategory =  min(10, nrow(go_up_MF))) +
        labs(title = "GO Molecular Function Enrichment",
             x = "Gene Ratio",
        ) +
        theme(axis.text.y = element_text(size = 10, face = "bold"),
              legend.title = element_text(size = 12),
              legend.text = element_text(size = 10))
      # p1
      # 保存GO_MF气泡图为PDF
      ggsave(paste0(go_name,"/",go_name,"-GO_MF_dotplot-", timestamp,".pdf"), 
             plot = p1, 
             width = 6, 
             height = 6, 
             dpi = 300)
      break  # 满足条件时跳出循环
    } else {
      print(paste("阈值", thr, "找到", ifelse(is.null(go_up_MF), 0, nrow(go_up_MF)), "个富集通路,继续尝试下一个阈值"))
    }
  }
  
  # 如果所有阈值都不满足条件,则使用最后一个阈值
  if (is.null(selected_threshold)) {
    selected_threshold <- tail(thresholds, 1)
    print(paste("所有阈值均未找到足够的富集通路,使用最大阈值", selected_threshold))
    
    # 再次执行富集分析,确保使用最后一个阈值
    go_up_MF <- enrichGO(
      gene = up_genes,
      OrgDb = org.Hs.eg.db,
      ont = "MF",
      pAdjustMethod = "BH",
      pvalueCutoff = selected_threshold,
      qvalueCutoff = selected_threshold
    )
    # GO_MF
    p2 <- dotplot(go_up_MF, 
                  showCategory = 10,       # 显示前15个功能
                  # color = "p.adjust",      # 颜色映射为校正p值
                  # size = "Count"
    ) +        # 点大小映射为基因数量
      labs(title = "GO Molecular Function Enrichment",
           x = "Gene Ratio", # x轴为基因比例
      )+
      theme(axis.text.y = element_text(size = 10, face = "bold"),
            legend.title = element_text(size = 12),
            legend.text = element_text(size = 10))
    p2
    # 保存GO_MF气泡图为PDF
    ggsave(paste0(go_name,"/",go_name,"-GO_MF_dotplot-", timestamp,".pdf"), 
           plot = p2, 
           width = 6, 
           height = 6, 
           dpi = 300)
  }
  }
  
  
  
  
  
  
  
  
  
  
  
 
  # GO_CC------------------------------------------------------------------------
  {
  # 定义要测试的p值阈值
  thresholds <- c(0.01, 0.05, 0.1, 0.2)
  selected_threshold <- NULL  # 存储最终选择的阈值
  # 循环遍历各个阈值
  for (thr in thresholds) {
    # 执行GO富集分析 (BP类别)
    go_up_CC <- enrichGO(
      gene = up_genes,                 # 你的差异表达基因列表
      OrgDb = org.Hs.eg.db,            # 物种注释数据库
      ont = "CC",                      # 只分析生物过程(BP)
      pAdjustMethod = "BH",            # p值校正方法
      pvalueCutoff = thr,              # 当前测试的p值阈值
      qvalueCutoff = thr               # q值阈值
    )
    # 检查富集结果是否有至少10个通路
    if (!is.null(go_up_CC) && nrow(go_up_CC) >= 10) {
      selected_threshold <- thr
      print(paste("使用阈值", thr, "找到", nrow(go_up_CC), "个富集通路"))
      # GO_BP 绘图
      p1 <- dotplot(go_up_CC, showCategory =  min(10, nrow(go_up_CC))) +
        labs(title = "GO Cellular Component Enrichment",
             x = "Gene Ratio",
        ) +
        theme(axis.text.y = element_text(size = 10, face = "bold"),
              legend.title = element_text(size = 12),
              legend.text = element_text(size = 10))
      # p1
      # 保存GO_MF气泡图为PDF
      ggsave(paste0(go_name,"/",go_name,"-GO_CC_dotplot-", timestamp,".pdf"), 
             plot = p1, 
             width = 6, 
             height = 6, 
             dpi = 300)
      break  # 满足条件时跳出循环
    } else {
      print(paste("阈值", thr, "找到", ifelse(is.null(go_up_CC), 0, nrow(go_up_CC)), "个富集通路,继续尝试下一个阈值"))
    }
  }
  
  # 如果所有阈值都不满足条件,则使用最后一个阈值
  if (is.null(selected_threshold)) {
    selected_threshold <- tail(thresholds, 1)
    print(paste("所有阈值均未找到足够的富集通路,使用最大阈值", selected_threshold))
    
    # 再次执行富集分析,确保使用最后一个阈值
    go_up_CC <- enrichGO(
      gene = up_genes,
      OrgDb = org.Hs.eg.db,
      ont = "CC",
      pAdjustMethod = "BH",
      pvalueCutoff = selected_threshold,
      qvalueCutoff = selected_threshold
    )
    # GO_cc
    p3 <- dotplot(go_up_CC, 
                  showCategory = 10,       # 显示前10个功能
                  # color = "p.adjust",      # 颜色映射为校正p值
                  # size = "Count"
    ) +        # 点大小映射为基因数量
      labs(title = "GO Cellular Component Enrichment",
           x = "Gene Ratio", # x轴为基因比例
      )+
      theme(axis.text.y = element_text(size = 10, face = "bold"),
            legend.title = element_text(size = 12),
            legend.text = element_text(size = 10))
    p3
    # 保存GO_CC气泡图为PDF
    ggsave(paste0(go_name,"/",go_name,"-GO_CC_dotplot-", timestamp,".pdf"), 
           plot = p3, 
           width = 6, 
           height = 6, 
           dpi = 300)
  }
 
  }
}
 
  1. 导入前两个R文件,并调用函数代码,在如下文件中
    use_vol_CRISPR-20250519.R
source("./Func_CRISPR_20250519-Huh7-RAG.R")
source("./Func_KEGG_GO_20250520.R")

# 指定CRISPR-MAGEECK分析的Counts 表格数据文件路径
file_path <- "E:/Coding/R_gzlab_docu/CrisprNGS/CRISPR-AQY/20250515/fenxi/Ctrl_Mock.gene_summary.txt"
file_path <- "E:/Coding/R_gzlab_docu/CrisprNGS/CRISPR-AQY/20250515/fenxi/R2_Mock.gene_summary.txt"
file_path <- "E:/Coding/R_gzlab_docu/CrisprNGS/CRISPR-AQY/20250515/fenxi/R5_Mock.gene_summary.txt"
file_path <- "E:/Coding/R_gzlab_docu/CrisprNGS/CRISPR-AQY/20250515/fenxi/R2_Ctrl.gene_summary.txt"
file_path <- "E:/Coding/R_gzlab_docu/CrisprNGS/CRISPR-AQY/20250515/fenxi/R5_Ctrl.gene_summary.txt"
file_path <- "E:/Coding/R_gzlab_docu/CrisprNGS/CRISPR-AQY/20250515/fenxi/R5_R2.gene_summary.txt"


{
  
# Vol_CRISPR <- function(file_path,  pC, FCC)
filepath_csv_updown <- Vol_CRISPR(file_path,0.01,1)
filepath_csv_updown
KEGG_GO(filepath_csv_updown)
}
# filepath_csv_updown <- "R2_Ctrl/R2_Ctrl.csv"
# filepath_csv_updown




# 获取所有以.gene_summary.txt结尾的文件路径
file_paths <- list.files(
  path = "./",           # 目标文件夹
  pattern = "\\.gene_summary\\.txt$",  # 正则表达式匹配
  full.names = TRUE          # 返回全路径
)
for (file_path in file_paths){
  filepath_csv_updown <- Vol_CRISPR(file_path,0.01,1)
  KEGG_GO(filepath_csv_updown)
}

©著作权归作者所有,转载或内容合作请联系作者
平台声明:文章内容(如有图片或视频亦包括在内)由作者上传并发布,文章内容仅代表作者本人观点,简书系信息发布平台,仅提供信息存储服务。

推荐阅读更多精彩内容