BCFtools is a toolkit for processing VCF and BCF files; see the BCFtools documentation for full details.
This manual page was last updated 2022-02-21 and refers to bcftools git version 1.15.
Run bcftools with no arguments to list all available commands:
- annotate .. edit VCF files, add or remove annotations
## Set the ID column from CHROM, POS, REF, and ALT (the leading + fills in only missing IDs)
bcftools annotate --set-id +'%CHROM\_%POS\_%REF\_%FIRST_ALT' file.vcf
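The command above can be tried on a toy file. This is a minimal sketch: the file name toy.vcf, the contig chr1, and the two records are made up for illustration, and the bcftools step is skipped when bcftools is not on PATH.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a tiny VCF (hypothetical data): two records whose ID column is missing (".").
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=chr1,length=1000>\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t100\t.\tA\tG\t.\t.\t.\n'
  printf 'chr1\t200\t.\tC\tT\t.\t.\t.\n'
} > toy.vcf

if command -v bcftools >/dev/null 2>&1; then
  # The leading "+" sets only missing IDs; existing IDs are left unchanged.
  bcftools annotate --set-id +'%CHROM\_%POS\_%REF\_%FIRST_ALT' toy.vcf -o toy.id.vcf
  # Print CHROM, POS, and the new ID for each record.
  grep -v '^#' toy.id.vcf | cut -f1-3
else
  echo "bcftools not found; skipping the annotate step" >&2
fi
```

Dropping the "+" would overwrite every ID, including ones that are already set.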
call .. SNP/indel calling (former "view")
cnv .. Copy Number Variation caller
concat .. concatenate VCF/BCF files from the same set of samples
Use it to join the per-chromosome VCF files of one sample set into a single file.
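A minimal sketch of such a concatenation, assuming two hypothetical per-chromosome files (chr1.vcf and chr2.vcf) that share one sample, S1. The file names, the header, and the records are illustrative only; the bcftools steps run only if bcftools is installed.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical header shared by both per-chromosome files.
header() {
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=chr1,length=1000>\n'
  printf '##contig=<ID=chr2,length=1000>\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n'
}
{ header; printf 'chr1\t100\t.\tA\tG\t.\t.\t.\tGT\t0/1\n'; } > chr1.vcf
{ header; printf 'chr2\t50\t.\tC\tT\t.\t.\t.\tGT\t1/1\n'; } > chr2.vcf

# List the inputs in the order they should appear in the output.
printf '%s\n' chr1.vcf chr2.vcf > file_list.txt

if command -v bcftools >/dev/null 2>&1; then
  # -f reads the input file names from a list instead of the command line.
  bcftools concat -f file_list.txt -Oz -o all_chr.vcf.gz
  bcftools index all_chr.vcf.gz
else
  echo "bcftools not found; skipping the concat step" >&2
fi
```

concat expects every input to contain the same samples in the same order; files with different samples are a job for merge instead.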
consensus .. create consensus sequence by applying VCF variants
convert .. convert VCF/BCF to other formats and back
csq .. haplotype aware consequence caller
filter .. filter VCF/BCF files using fixed thresholds
gtcheck .. check sample concordance, detect sample swaps and contamination
head .. view VCF/BCF file headers
index .. index VCF/BCF
isec .. intersections of VCF/BCF files
merge .. merge VCF/BCF files from non-overlapping sample sets
mpileup .. multi-way pileup producing genotype likelihoods
norm .. normalize indels
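Left-aligns and normalizes the indels in a VCF.
## Check that the REF allele of each variant (e.g. each SV) matches the reference genome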
bcftools norm --check-ref e --fasta-ref Sp_YY_v2.fa $inputVCF
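The check can be reproduced on a toy reference. In this sketch, ref.fa, toy.vcf, and the 10 bp sequence are hypothetical stand-ins for Sp_YY_v2.fa and $inputVCF; the record's REF allele was chosen to agree with the reference.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 10 bp reference: position 3 of chr1 is "G".
printf '>chr1\nACGTACGTAC\n' > ref.fa

# One record whose REF allele (G at position 3) agrees with ref.fa.
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=chr1,length=10>\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t3\t.\tG\tT\t.\t.\t.\n'
} > toy.vcf

if command -v bcftools >/dev/null 2>&1; then
  # "e" exits with an error at the first REF mismatch; "w" would only warn.
  if bcftools norm --check-ref e --fasta-ref ref.fa toy.vcf -o toy.checked.vcf; then
    echo "REF alleles agree with ref.fa"
  else
    echo "REF mismatch detected" >&2
  fi
else
  echo "bcftools not found; skipping the norm step" >&2
fi
```

The other values accepted by --check-ref are x (exclude mismatching sites) and s (set or fix the REF allele).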
plugin .. run user-defined plugin
polysomy .. detect contaminations and whole-chromosome aberrations
query .. transform VCF/BCF into user-defined formats
## Extract the CHROM and POS columns
bcftools query -f '%CHROM\t%POS\n' output.vcf.gz
## List the samples that carry an ALT genotype; per-sample (FORMAT) fields such as %SAMPLE and %GT must be wrapped in []
bcftools query -f '%CHROM:%POS [%SAMPLE %GT]\n' -i'GT="alt"' file.bcf
## List the sample names in a VCF file
bcftools query -l *vcf.gz >samples_list.txt
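The three query forms above can be compared side by side on a small file. This is a sketch under assumptions: toy.vcf, the samples S1 and S2, and both records are invented for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Toy VCF (hypothetical data): two records, two samples.
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=chr1,length=1000>\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n'
  printf 'chr1\t100\t.\tA\tG\t.\t.\t.\tGT\t0/1\t0/0\n'
  printf 'chr1\t200\t.\tC\tT\t.\t.\t.\tGT\t0/0\t0/0\n'
} > toy.vcf

if command -v bcftools >/dev/null 2>&1; then
  # Site-level fields only: one output line per record.
  bcftools query -f '%CHROM\t%POS\n' toy.vcf
  # Per-sample fields go inside []; the bracketed part is repeated for each sample.
  bcftools query -f '%CHROM:%POS[ %SAMPLE=%GT]\n' toy.vcf
  # Sample names, one per line.
  bcftools query -l toy.vcf
else
  echo "bcftools not found; skipping the query steps" >&2
fi
```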
- reheader .. modify VCF/BCF header, change sample names
bcftools reheader -s test_changed_id -o test.newid.vcf.gz test.vcf.gz
# -s takes a file listing the samples to rename; it has two whitespace-separated columns: the old name first, the new name second
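One way to produce the two-column rename file is to derive it from the current sample names. The sketch below assumes a toy VCF with samples S1, S2, and S3, and an arbitrary "pop1_" prefix for the new names; none of these come from the original data.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Toy VCF (hypothetical data) with three samples: S1, S2, S3.
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=chr1,length=1000>\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\n'
  printf 'chr1\t100\t.\tA\tG\t.\t.\t.\tGT\t0/1\t0/0\t1/1\n'
} > toy.vcf

# Read the sample names from the #CHROM line (columns 10 onward), one per line.
grep '^#CHROM' toy.vcf | cut -f10- | tr '\t' '\n' > samples_list.txt

# Write the "old_name new_name" map; the "pop1_" prefix is an arbitrary choice.
awk '{ print $1, "pop1_" $1 }' samples_list.txt > test_changed_id
cat test_changed_id

if command -v bcftools >/dev/null 2>&1; then
  bcftools reheader -s test_changed_id -o toy.newid.vcf toy.vcf
  # Confirm the new names.
  bcftools query -l toy.newid.vcf
else
  echo "bcftools not found; skipping the reheader step" >&2
fi
```

Only the samples listed in the map are renamed, so the file does not have to cover every sample.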
roh .. identify runs of homo/auto-zygosity
sort .. sort VCF/BCF files
stats .. produce VCF/BCF stats (former vcfcheck)
view .. subset, filter and convert VCF and BCF files
(1) Extract the specified samples from a VCF
bcftools index *vcf.gz
bcftools view -S samples *vcf.gz -Oz -o *_samples.vcf.gz
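A minimal end-to-end sketch of the subsetting step, assuming a toy VCF with samples S1, S2, and S3 and a samples file that keeps S1 and S3. All file names and records are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Toy VCF (hypothetical data) with three samples.
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=chr1,length=1000>\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\n'
  printf 'chr1\t100\t.\tA\tG\t.\t.\t.\tGT\t0/1\t0/0\t1/1\n'
} > toy.vcf

# The samples file lists one sample name per line.
printf '%s\n' S1 S3 > samples

if command -v bcftools >/dev/null 2>&1; then
  # Compress and index first, mirroring the workflow in the text.
  bcftools view toy.vcf -Oz -o toy.vcf.gz
  bcftools index toy.vcf.gz
  bcftools view -S samples toy.vcf.gz -Oz -o toy_samples.vcf.gz
  # List the samples that remain in the subset.
  bcftools query -l toy_samples.vcf.gz
else
  echo "bcftools not found; skipping the view step" >&2
fi
```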
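(2) Filter a VCF
# F_MISSING is a fraction between 0 and 1, so a 15% missingness cutoff is written as 0.15
bcftools view -i 'F_MISSING < 0.15 & MAC > 3' -m2 -M2 *vcf -Oz -o *fil.vcf.gz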
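To make the two quantities in such a filter expression concrete, the sketch below recomputes them by hand for one biallelic record. It is a simplified illustration, not how bcftools computes them internally: the record and its four genotypes are invented, and any genotype containing "." is counted as missing.

```shell
# One hypothetical biallelic record with four samples: 0/1, ./., 1/1, 0/0.
record=$(printf 'chr1\t100\t.\tA\tG\t.\t.\t.\tGT\t0/1\t./.\t1/1\t0/0')

stats=$(printf '%s\n' "$record" | awk -F'\t' '{
  n = 0; miss = 0; ref = 0; alt = 0
  for (i = 10; i <= NF; i++) {          # sample columns start at field 10
    n++
    if ($i ~ /\./) { miss++; continue } # simplification: any "." means missing
    k = split($i, a, "[/|]")            # split on unphased or phased separator
    for (j = 1; j <= k; j++) { if (a[j] == "0") ref++; else alt++ }
  }
  mac = (ref < alt) ? ref : alt         # minor allele count at a biallelic site
  printf "F_MISSING=%.2f MAC=%d\n", miss / n, mac
}')

echo "$stats"   # prints: F_MISSING=0.25 MAC=3
```

One of four genotypes is missing, so F_MISSING is 0.25, and the three called genotypes carry three REF and three ALT alleles, so MAC is 3.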
(3) Restrict to the regions listed in a file; format: Chr\tpos
bcftools view -R Chr_pos *vcf.gz -Oz -o *Chr_pos.vcf.gz
(4) Restrict to the specified sites
bcftools view -T selected_snps.txt output.vcf.gz -Oz -o selected_snps.vcf.gz
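Options -R and -T accept the same tab-delimited file of chromosome and position, but they read the input differently. A short sketch on toy data follows; the file names and the three records are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Toy VCF (hypothetical data) with three sites on chr1.
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=chr1,length=1000>\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t100\t.\tA\tG\t.\t.\t.\n'
  printf 'chr1\t200\t.\tC\tT\t.\t.\t.\n'
  printf 'chr1\t300\t.\tG\tA\t.\t.\t.\n'
} > toy.vcf

# Sites to keep: tab-delimited CHROM and POS, one site per line.
printf 'chr1\t100\nchr1\t300\n' > selected_snps.txt

if command -v bcftools >/dev/null 2>&1; then
  bcftools view toy.vcf -Oz -o toy.vcf.gz
  bcftools index toy.vcf.gz
  # -R uses the index to jump straight to the listed positions.
  bcftools view -R selected_snps.txt toy.vcf.gz -Oz -o by_regions.vcf.gz
  # -T streams through the whole file and keeps matching sites; no index is needed.
  bcftools view -T selected_snps.txt toy.vcf.gz -Oz -o by_targets.vcf.gz
  bcftools query -f '%CHROM\t%POS\n' by_targets.vcf.gz
else
  echo "bcftools not found; skipping the view steps" >&2
fi
```

As a rule of thumb, -R tends to be faster for a few sites in a large indexed file, while -T suits long site lists or unindexed input.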
- intersect
bedtools intersect [OPTIONS] -a <bed/gff/vcf/bam> -b <bed/gff/vcf/bam>
Can be used to find the genes in one GFF file that have no overlap with any gene in another GFF file, using the -v option
-v :Only report those entries in A that have _no overlaps_ with B.
- Similar to "grep -v" (an homage).
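A minimal sketch of the -v use case with two toy GFF files. The file names a.gff and b.gff, the gene IDs, and the coordinates are invented for illustration; the bedtools call is skipped when bedtools is not installed.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# a.gff: two hypothetical genes on chr1.
printf 'chr1\tdemo\tgene\t100\t200\t.\t+\t.\tID=geneA\n'  > a.gff
printf 'chr1\tdemo\tgene\t500\t600\t.\t+\t.\tID=geneB\n' >> a.gff

# b.gff: one gene that overlaps geneA (150-250) but not geneB.
printf 'chr1\tdemo\tgene\t150\t250\t.\t+\t.\tID=geneX\n' > b.gff

if command -v bedtools >/dev/null 2>&1; then
  # -v reports the entries of A that overlap nothing in B.
  bedtools intersect -v -a a.gff -b b.gff
else
  echo "bedtools not found; skipping the intersect step" >&2
fi
```

With these coordinates, geneB is the only entry of a.gff without an overlap in b.gff, so it is the one -v should report.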