2023-09-27对CUT&Tag数据进行注释

主要参考这篇文章教程:https://www.jianshu.com/p/26aaba19a605

1.合并生物学重复的narrow文件,取交集common peaks

首先分别合并young、old组的H3K27ac和PR测序数据(callpeak完的narrow文档)。具体操作如下:

bedtools intersect \
-a H3K27ac-7_peaks.narrowPeak \
-b H3K27ac-8_peaks.narrowPeak H3K27ac-10_peaks.narrowPeak \
 > /home/data/t170409/data/aging_/callpeak/common_peaks/H3K27ac_old_overlaps.bed
bedtools intersect \
-a PR-1_peaks.narrowPeak \
-b PR-5_peaks.narrowPeak PR-7_peaks.narrowPeak \
 > /home/data/t170409/data/aging_/callpeak/common_peaks/PR_old_overlaps.bed
bedtools intersect \
-a H3K27ac-2_peaks.narrowPeak \
-b H3K27ac-4_peaks.narrowPeak H3K27ac-6_peaks.narrowPeak \
 > /home/data/t170409/data/aging_/callpeak/common_peaks/H3K27ac_young_overlaps.bed
bedtools intersect \
-a PR-3_peaks.narrowPeak \
-b PR-4_peaks.narrowPeak PR-6_peaks.narrowPeak \
 > /home/data/t170409/data/aging_/callpeak/common_peaks/PR_young_overlaps.bed

不要将bed文件取交集,输出peak数量较少。得到以下四个文件:


2.安装R包

使用ChIPseeker需要准备两个文件:一个就是要注释的peaks的文件,需满足BED格式。另一个就是注释参考文件,即需要一个包含注释信息的TxDb对象。

if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("TxDb.Hsapiens.UCSC.hg19.knownGene") #这个包用来构建包含注释信息的TxDb对象
if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("ChIPseeker")
if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("TxDb.Hsapiens.UCSC.hg19.knownGene")
if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("org.Hs.eg.db")

3.将上述4个取common peak的bed文件进行注释。

(1)载入R包,构建txdb文件

library("ChIPseeker")
library("org.Hs.eg.db"
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

(2)读入peaks文件

函数readPeakFile读入peaks文件

H3K27ac_old_peak <- readPeakFile("H3K27ac_old_overlaps.bed")
H3K27ac_young_peak <- readPeakFile("H3K27ac_young_overlaps.bed")
PR_young_peak <- readPeakFile("PR_young_overlaps.bed")
PR_old_peak <- readPeakFile("PR_old_overlaps.bed")

(3)注释文档

#构建数据文档,并进行注释
H3K27ac_young <- annotatePeak(H3K27ac_young_peak,tssRegion = c(-3000, 3000),TxDb = txdb,annoDb = "org.Hs.eg.db")
H3K27ac_old <- annotatePeak(H3K27ac_old_peak,tssRegion = c(-3000, 3000),TxDb = txdb,annoDb = "org.Hs.eg.db")
PR_young <- annotatePeak(PR_young_peak,tssRegion = c(-3000, 3000),TxDb = txdb,annoDb = "org.Hs.eg.db")
PR_old <- annotatePeak(PR_old_peak,tssRegion = c(-3000, 3000),TxDb = txdb,annoDb = "org.Hs.eg.db")

#输出数据
write.table(as.data.frame(H3K27ac_young),"H3K27ac_young注释.tsv",sep="\t",row.names = F,quote = F)
write.table(as.data.frame(H3K27ac_old),"H3K27ac_old注释.tsv",sep="\t",row.names = F,quote = F)
write.table(as.data.frame(PR_young),"PR_young注释.tsv",sep="\t",row.names = F,quote = F)
write.table(as.data.frame(PR_old),"PR_old注释.tsv",sep="\t",row.names = F,quote = F)

(4)成功注释

4.合并文档,进行比较分析及数据可视化

(1)安装R包

if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("clusterProfiler")
©著作权归作者所有,转载或内容合作请联系作者
【社区内容提示】社区部分内容疑似由AI辅助生成,浏览时请结合常识与多方信息审慎甄别。
平台声明:文章内容(如有图片或视频亦包括在内)由作者上传并发布,文章内容仅代表作者本人观点,简书系信息发布平台,仅提供信息存储服务。

相关阅读更多精彩内容

友情链接更多精彩内容