今天上课老师让我们按照PPT上操作顺一遍DNA-seq流程,记录如下(机器懂了人懵了:
##.fastq文件准备
156 cp yeast_1.fastq /bios-analysis8/omics2022_post/cmd
157 cp yeast_2.fastq /bios-analysis8/omics2022_post/cmd
##若人多时,登入节点:
145 ssh c03n01
146 logout #退出节点
##运行fastqc 查看原始数据的质量
163 fastqc yeast_1.fastq
164 fastqc yeast_2.fastq
##运行fastx_trimmer 去除reads末端质量较差的部分 并查看质量
166 fastx_trimmer -Q 33 -l 290 -i yeast_1.fastq -o yeast_trim_1.fastq
167 fastx_trimmer -Q 33 -l 200 -i yeast_2.fastq -o yeast_trim_2.fastq
###-l:last base to keep.default is entire read
168 fastqc yeast_trim_1.fastq
169 fastqc yeast_trim_2.fastq
##Mapping
###由于reads>100bps,所以使用bowtie2
171 bowtie2 -x /bios-store1/yeast_reference/Bowtie2Index_for_gtf/genome -1 yeast_trim_1.fastq -2 yeast_trim_2.fastq -S alignment.sam
Usage:
bowtie2 [options]* -x <bt2-idx> {-1 <m1> -2 <m2> | -U <r> | --interleaved <i>} [-S <sam>]
<bt2-idx> Index filename prefix (minus trailing .X.bt2).
NOTE: Bowtie 1 and Bowtie 2 indexes are not compatible.
<m1> Files with #1 mates, paired with files in <m2>.
Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
<m2> Files with #2 mates, paired with files in <m1>.
Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
<r> Files with unpaired reads.
Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
<i> Files with interleaved paired-end FASTQ reads
Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
<sam> File for SAM output (default: stdout)
173 less alignment.sam
##convert the sam file to bam
174 samtools view -Sb alignment.sam -o alignment.bam
##-Sb:input is SAM,output is BAM
##sorted the bam file
176 samtools sort alignment.bam align_sorted
##index the bam file
178 samtools index align_sorted.bam align_sorted.bam.bai
##可视化查看alignment result
201 samtools tview align_sorted.bam /bios-store1/yeast_reference/yeastgenome.fa
##查看alignment分析结果
204 /bios-store1/bin/bam_stat.py -i align_sorted.bam
##测序数据在基因组上的覆盖度分析
207 /bios-store1/bin/bam2wig.py -s /bios-store1/yeast_reference/yeast_genome_size_chr.txt -u -q 20 -i align_sorted.bam -o align
Usage: bam2wig.py [options]
Convert BAM file into wig file. BAM file must be sorted and indexed using SAMtools.
Note: SAM format file is not supported.
-s:chromosome size file;-u:skip non-unique hit reads; -q:Minimum mapping quality to determine "uniquely mapped". default=30;
-o"output file(wig)
208 ls -lh
209 wc -l align.wig
210 more +1 align.wig
211 more +20000 align.wig
##genotyping calling analysis
212 /bios-store1/bin/samtools mpileup -D -q 20 -ugf /bios-store1/yeast_reference/yeastgenome.fa align_sorted.bam | bcftools view -vcg -D100 -> genotyping.vcf
213 less genotyping.vcf
##候选基因功能分析
###vcf文件转为annovar input file
214 /bios-store1/program/annovar/convert2annovar.pl -format vcf4 ./genotyping.vcf -outfile ./genotyping.inp
###候选基因注释
218 /bios-store1/program/annovar/annotate_variation.pl -out ./genotyping -build AT ./genotyping.inp /bios-store1/program/annovar/atdb/