DNA-seq 流程

今天上课老师让我们按照PPT上操作顺一遍DNA-seq流程，记录如下（机器懂了人懵了：

##.fastq文件准备

156 cp yeast_1.fastq /bios-analysis8/omics2022_post/cmd

157 cp yeast_2.fastq /bios-analysis8/omics2022_post/cmd

##若人多时，登入节点：

145 ssh c03n01

146 logout #退出节点

##运行fastqc 查看原始数据的质量

163 fastqc yeast_1.fastq

164 fastqc yeast_2.fastq

##运行fastx_trimmer 去除reads末端质量较差的部分并查看质量

166 fastx_trimmer -Q 33 -l 290 -i yeast_1.fastq -o yeast_trim_1.fastq

167 fastx_trimmer -Q 33 -l 200 -i yeast_2.fastq -o yeast_trim_2.fastq

###-l:last base to keep.default is entire read

168 fastqc yeast_trim_1.fastq

169 fastqc yeast_trim_2.fastq

##Mapping

###由于reads>100bps，所以使用bowtie2

171 bowtie2 -x /bios-store1/yeast_reference/Bowtie2Index_for_gtf/genome -1 yeast_trim_1.fastq -2 yeast_trim_2.fastq -S alignment.sam

Usage:

bowtie2 [options]* -x <bt2-idx> {-1 <m1> -2 <m2> | -U <r> | --interleaved <i>} [-S <sam>]

<bt2-idx> Index filename prefix (minus trailing .X.bt2).

NOTE: Bowtie 1 and Bowtie 2 indexes are not compatible.

<m1> Files with #1 mates, paired with files in <m2>.

Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).

<m2> Files with #2 mates, paired with files in <m1>.

Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).

<r> Files with unpaired reads.

Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).

<i> Files with interleaved paired-end FASTQ reads

Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).

<sam> File for SAM output (default: stdout)

173 less alignment.sam

##convert the sam file to bam

174 samtools view -Sb alignment.sam -o alignment.bam

##-Sb:input is SAM,output is BAM

##sorted the bam file

176 samtools sort alignment.bam align_sorted

##index the bam file

178 samtools index align_sorted.bam align_sorted.bam.bai

##可视化查看alignment result

201 samtools tview align_sorted.bam /bios-store1/yeast_reference/yeastgenome.fa

##查看alignment分析结果

204 /bios-store1/bin/bam_stat.py -i align_sorted.bam

##测序数据在基因组上的覆盖度分析

207 /bios-store1/bin/bam2wig.py -s /bios-store1/yeast_reference/yeast_genome_size_chr.txt -u -q 20 -i align_sorted.bam -o align

Usage: bam2wig.py [options]

Convert BAM file into wig file. BAM file must be sorted and indexed using SAMtools.

Note: SAM format file is not supported.

-s:chromosome size file;-u:skip non-unique hit reads; -q:Minimum mapping quality to determine "uniquely mapped". default=30;

-o"output file(wig)

208 ls -lh

209 wc -l align.wig

210 more +1 align.wig

211 more +20000 align.wig

##genotyping calling analysis

212 /bios-store1/bin/samtools mpileup -D -q 20 -ugf /bios-store1/yeast_reference/yeastgenome.fa align_sorted.bam | bcftools view -vcg -D100 -> genotyping.vcf

213 less genotyping.vcf

##候选基因功能分析

###vcf文件转为annovar input file

214 /bios-store1/program/annovar/convert2annovar.pl -format vcf4 ./genotyping.vcf -outfile ./genotyping.inp

###候选基因注释

218 /bios-store1/program/annovar/annotate_variation.pl -out ./genotyping -build AT ./genotyping.inp /bios-store1/program/annovar/atdb/

最后获得所有文件

最后编辑于：2022.11.03 13:38:33

DNA-seq 流程

推荐阅读更多精彩内容