https://www.bioinfo-scrounger.com/archives/111/
DESEQ往往作为寻找差异基因的工具而被我们所熟知,但是在寻找差异盛行的年代,也可以用Deseq寻找差异contact。
不信?来看一下:
文章一:Alterations of specific chromatin conformation affect ATRA-induced leukemia cell differentiation
文章二:Highly rearranged chromosomes reveal uncoupling between genome topology and gene expression
那么如果我们想用bin pairs 做差异contact分析,该如何做呢?
首先以bin pair 为基本单位准备多个文库(2个左右)的raw_count_table,为每个文库准备好组别信息的condition文件。最终按照以下操作就大功告成啦~~
library(DESeq)
library(gplots)
## read raw count table
#name arep1 arep2 brep1 brep2
#bin_pair1 500 500 400 700
#............................................
data=read.table(raw_count_table,header=T,row.names=1)
##read condition file content like that
#sample condition
#arep1 group_A
#arep2 group_A
#brep1 group_B
#brep2 group_B
design=read.table(conditions,header=T,row.names=1)
#condition1=group_A
#condition2=group_B
cmp = c(condition1,condition2)
dd=which(design$condition==cmp[1] | design$condition==cmp[2])
new_data = data[,dd]
#---------'generate DEseq object'--------
cds=newCountDataSet(new_data,design[dd,1])
print('estimate Size Factors')
cds=estimateSizeFactors(cds)
print('estimate Dispersions')
cds=estimateDispersions(cds,method='per-condition',fitType="local")
res=nbinomTest(cds,cmp[1],cmp[2])
result = array('no',dim=c(length(res$padj),1))
result[which(res$padj < 0.05)] = 'yes'
total =cbind(res,counts(cds,normalized =T),result)
write.table(total,file=out_result,sep="\t", quote = FALSE, row.names = FALSE)
通过上面呢,我觉得只要模型对,方法都是通用的。