ChIP-seq Data Analysis Hands-on Training (Part 4)

Finding motifs with HOMER

Installing HOMER and downloading the genome data

cd /public/workspace/fangwen/learn/chip-seq/biosoft/
mkdir homer &&  cd homer
wget http://homer.salk.edu/homer/configureHomer.pl 
perl configureHomer.pl -install
perl configureHomer.pl -install mm10
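
The later steps call findMotifsGenome.pl and annotatePeaks.pl by name, so the shell has to be able to find them. The following is a minimal sketch, assuming the HOMER executables end up in a bin/ directory under the install location used above; check_tools is a hypothetical helper written for this illustration, not part of HOMER.

```shell
# Assumption: HOMER's scripts live in bin/ under the install directory used above
HOMER_BIN=/public/workspace/fangwen/learn/chip-seq/biosoft/homer/bin
export PATH="$HOMER_BIN:$PATH"

# Hypothetical helper: report whether each named command is on PATH.
# Returns 0 if all were found, 1 if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

check_tools findMotifsGenome.pl annotatePeaks.pl || echo "HOMER scripts are not on PATH yet"
```

To make the change permanent, the export line can be appended to ~/.bashrc.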

Running HOMER

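HOMER's motif finding combines two approaches: lookup against a database of known motifs, and de novo inference. Both read the BED-format peaks file produced by the upstream ChIP-seq analysis.
Despite doing both, it is simple to use; see http://homer.ucsd.edu/homer/ngs/peakMotifs.html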

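cd /public/workspace/fangwen/learn/chip-seq/motif/
for id in /public/workspace/fangwen/learn/chip-seq/peaks/*.bed
do
    echo "$id"
    file=$(basename "$id")
    sample=${file%%.*}    # strip everything from the first dot onward
    echo "$sample"
    # Reorder the BED columns into HOMER's peak format: ID, chr, start, end, strand
    awk '{print $4"\t"$1"\t"$2"\t"$3"\t+"}' "$id" > homer_peaks.tmp
    # Motif search (known and de novo) with motif lengths 8, 10 and 12
    findMotifsGenome.pl homer_peaks.tmp mm10 "${sample}_motifDir" -len 8,10,12
    # Peak annotation: the table goes to stdout, the log to stderr
    annotatePeaks.pl homer_peaks.tmp mm10 1>"${sample}.peakAnn.xls" 2>"${sample}.annLog.txt"
done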
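
To see what the loop does to each file before running HOMER itself, the sample-name extraction and the awk column reordering can be tried on a toy BED file. This is only a sketch: the directory toy_peaks, the file name sampleA.macs2.bed, and the peak row are made up for illustration, and the output is written to one file per sample rather than to homer_peaks.tmp.

```shell
# Hypothetical toy input: one peak in BED order (chr, start, end, name, score)
mkdir -p toy_peaks
printf 'chr1\t100\t200\tpeak_1\t50\n' > toy_peaks/sampleA.macs2.bed

for id in toy_peaks/*.bed; do
    file=$(basename "$id")      # sampleA.macs2.bed
    sample=${file%%.*}          # %%.* removes the longest suffix starting at a dot
    # Column 4 (peak name) becomes the ID, followed by chr, start, end, and a "+" strand
    awk 'BEGIN { OFS = "\t" } { print $4, $1, $2, $3, "+" }' "$id" > "${sample}.homer_peaks.txt"
    echo "$sample"
done

cat sampleA.homer_peaks.txt
```

Note that %%.* cuts at the first dot, so two files such as sampleA.rep1.bed and sampleA.rep2.bed would both map to sampleA and overwrite each other's results.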

Save the code above as a script named runMotif.sh, then run it: nohup bash runMotif.sh 1>motif.log &
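
The command above redirects only stdout. If warnings and errors should land in the same log, stderr can be redirected as well. The sketch below shows the pattern on a stand-in script; toy_job.sh and toy_job.log are hypothetical names used only for this illustration.

```shell
# Stand-in for runMotif.sh: writes one line to stdout and one to stderr
cat > toy_job.sh <<'EOF'
echo "progress message on stdout"
echo "warning message on stderr" >&2
EOF

# 2>&1 sends stderr to the same file as stdout; & puts the job in the background
nohup bash toy_job.sh > toy_job.log 2>&1 &
job_pid=$!

# Block until the background job finishes, then inspect the log
wait "$job_pid"
cat toy_job.log
```

For a long-running job, tail -f on the log file is a convenient way to follow progress instead of waiting.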
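The script not only finds motifs but also annotates the peaks along the way. The resulting files with the peakAnn.xls suffix show annotations broadly similar to those produced by R annotation packages.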
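
Despite the .xls suffix, the annotation output is tab-delimited text, so it can be summarized with standard command-line tools. The sketch below counts peaks per annotation category. It assumes the table has a column whose header is exactly "Annotation" and that detailed values carry a parenthesized suffix; the toy file and its rows are made up, and a real file has many more columns.

```shell
# Hypothetical stand-in for a *.peakAnn.xls file
printf 'PeakID\tChr\tStart\tEnd\tStrand\tAnnotation\n' > toy.peakAnn.xls
printf 'p1\tchr1\t100\t200\t+\tintron (NM_000001, intron 1 of 5)\n' >> toy.peakAnn.xls
printf 'p2\tchr1\t500\t600\t+\tIntergenic\n' >> toy.peakAnn.xls
printf 'p3\tchr2\t100\t200\t+\tintron (NM_000002, intron 2 of 3)\n' >> toy.peakAnn.xls

awk -F'\t' '
    # Header row: find the Annotation column by name instead of hard-coding its position
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "Annotation") col = i; next }
    # Data rows: drop the parenthesized detail and tally the category
    col { a = $col; sub(/ \(.*/, "", a); n[a]++ }
    END { for (a in n) printf "%s\t%d\n", a, n[a] }
' toy.peakAnn.xls | sort > toy.annotation_counts.txt

cat toy.annotation_counts.txt
```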
MEME can also be used to find motifs; it needs FASTA sequences, which are extracted using the coordinates of the BED-format peaks. MEME: http://meme-suite.org/
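
To make the BED-to-FASTA step concrete, here is a toy awk sketch of what the extraction does: it loads a tiny FASTA into memory and cuts out each BED interval (0-based start, exclusive end). This is an illustration of the idea only, not the tool the author used; for a real genome a dedicated tool such as bedtools getfasta is the usual choice, since holding mm10 in awk memory is impractical. All file names and sequences below are made up.

```shell
# Hypothetical toy genome: one short "chromosome" split over two lines
cat > toy_genome.fa <<'EOF'
>chr1
ACGTACGTAC
GGGTTTAAAC
EOF

# Hypothetical toy peaks in BED format
printf 'chr1\t2\t6\tpeak1\nchr1\t8\t14\tpeak2\n' > toy_peaks.bed

awk -F'\t' '
    # First file (FASTA): concatenate sequence lines under each header name
    NR == FNR {
        if (/^>/) { name = substr($0, 2) } else { seq[name] = seq[name] $0 }
        next
    }
    # Second file (BED): BED start is 0-based, awk substr is 1-based
    {
        printf ">%s::%s:%d-%d\n%s\n", $4, $1, $2, $3, substr(seq[$1], $2 + 1, $3 - $2)
    }
' toy_genome.fa toy_peaks.bed > toy_peaks.fa

cat toy_peaks.fa
```

The resulting FASTA file is the kind of input that the MEME tools take.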

Other advanced analyses

For example, different peaks files can be compared; for code, see: https://github.com/jmzeng1314/NGS-pipeline/blob/master/CHIPseq/step6-ChIPpeakAnno-Venn.R
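
The linked script does the comparison in R with ChIPpeakAnno. As a rough command-line illustration of the same idea, and not a reimplementation of that script, the naive awk sketch below counts how many peaks in one BED file overlap any peak in another. The file names and intervals are made up, and the pairwise scan is only suitable for small files.

```shell
# Hypothetical toy peak sets
printf 'chr1\t100\t200\nchr1\t500\t600\nchr2\t100\t200\n' > toy_A.bed
printf 'chr1\t150\t250\nchr2\t300\t400\n' > toy_B.bed

awk -F'\t' '
    # First file (B): remember every interval
    NR == FNR { n++; chr[n] = $1; s[n] = $2 + 0; e[n] = $3 + 0; next }
    # Second file (A): two half-open intervals overlap if each starts before the other ends
    {
        hit = 0
        for (i = 1; i <= n; i++)
            if (chr[i] == $1 && s[i] < $3 + 0 && $2 + 0 < e[i]) { hit = 1; break }
        if (hit) shared++; else only++
    }
    END { printf "shared\t%d\nA_only\t%d\n", shared, only }
' toy_B.bed toy_A.bed > toy_overlap.txt

cat toy_overlap.txt
```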
This tutorial covers single-end sequencing data. For paired-end data, many of the parameters need to be changed.

# Variant of the loop above that passes the installed genome by explicit path instead of by name
cd /public/workspace/fangwen/learn/chip-seq/motif/
for id in /public/workspace/fangwen/learn/chip-seq/peaks/*.bed
do
    echo "$id"
    file=$(basename "$id")
    sample=${file%%.*}
    echo "$sample"
    awk '{print $4"\t"$1"\t"$2"\t"$3"\t+"}' "$id" > homer_peaks.tmp
    findMotifsGenome.pl homer_peaks.tmp /public/workspace/fangwen/learn/chip-seq/biosoft/homer/data/genomes/mm10/mm10 "${sample}_motifDir" -len 8,10,12
    annotatePeaks.pl homer_peaks.tmp /public/workspace/fangwen/learn/chip-seq/biosoft/homer/data/genomes/mm10 1>"${sample}.peakAnn.xls" 2>"${sample}.annLog.txt"
done

Last edited: 2021.04.22 10:26:56
