Understanding Chromap Output


Chromap will perform all of the following steps automatically for us:

  • Trim adapter sequences at the 3’ end of the reads
  • Set the maximum insert size between Read 1 & Read 2 to be 2000 bp
  • Deduplicate the mapped read pairs at the single-cell level
  • Correct cell barcodes based on provided whitelist
  • Shift read positions to make the start positions (5’) of the reads represent the actual Tn5 binding sites (not important for downstream peak calling, though)
  • Output read pairs (MAPQ >= 30 by default) as the commonly-used BED-like file fragments.tsv. In addition, the SAM format is also supported.

chromap -t 20 -x ref.idx -r ref.fa \
        --preset atac --summary summary.csv \
        -1 mCortex_ATAC_S1_L001_R1_001.fastq.gz \
        -2 mCortex_ATAC_S1_L001_R2_001.fastq.gz \
        -b mCortex_ATAC_S1_L001_I2_001.fastq.gz \
        --barcode-whitelist 737K-cratac-v1_rc.txt \
        -o chromap_outs/fragments.tsv \
        1> chromap.stdout 2> chromap.stderr \
        && bgzip chromap_outs/fragments.tsv
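
Once the command above has finished, the compressed fragments file can be inspected directly. The following is a minimal Python sketch, not part of Chromap itself, that counts fragments per cell barcode. It assumes the BED-like layout is chrom, start, end, barcode, duplicate count (tab-separated), so the barcode sits in the fourth column; the function names and the toy lines are made up for illustration.

```python
import gzip
from collections import Counter


def open_fragments(path):
    """Open a plain or bgzip/gzip-compressed fragments file as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def count_fragments_per_barcode(lines):
    """Count fragments per barcode from an iterable of BED-like lines.

    Assumption: the barcode is in column 4 (index 3).
    """
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        counts[fields[3]] += 1
    return counts


# Toy data standing in for chromap_outs/fragments.tsv.gz (values are invented).
demo_lines = [
    "chr1\t3000\t3150\tAAACGAAAG\t1\n",
    "chr1\t3500\t3720\tAAACGAAAG\t2\n",
    "chr2\t100\t260\tTTTGGTTCC\t1\n",
]
demo_counts = count_fragments_per_barcode(demo_lines)
print(demo_counts.most_common())
```

On real data the same function could be called as `count_fragments_per_barcode(open_fragments("chromap_outs/fragments.tsv.gz"))`; the sum of all counts should then equal the number of lines in the fragments file.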

The results come in two parts:

  1. chromap.stderr
  2. summary.csv
  • Number of lines in the summary file: one per barcode identified in the barcode FASTQ
  • Number of raw reads: 475,581,592 read pairs (× 2 = 951,163,184 reads)
  • The total column of the summary: the number of reads for each barcode sequence (reads/cell)
  • Number of mapped reads: 810,898,762 = 792,439,756 + 18,459,006. This is consistent with the summary.csv file, where (total - unmapped) × 2 = (475,581,592 - 70,132,211) × 2 = 810,898,762
  • Number of output mappings (passed filters): 161,219,326 (the number of lines in fragments.tsv)
  • Number of fragments output for a cell = total - unmapped - lowmapq - duplicate

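The arithmetic in the list above can be checked with a few lines of Python. This is only a sanity check on the numbers quoted in this post; the variable names are my own and do not come from Chromap.

```python
# Figures quoted above from chromap.stderr and summary.csv.
total_pairs = 475_581_592          # raw read pairs
unmapped_pairs = 70_132_211        # unmapped read pairs
mapped_reads_components = (792_439_756, 18_459_006)
output_fragments = 161_219_326     # lines in fragments.tsv

raw_reads = total_pairs * 2
mapped_reads = (total_pairs - unmapped_pairs) * 2

# Both routes to the mapped-read count must agree.
assert raw_reads == 951_163_184
assert mapped_reads == sum(mapped_reads_components) == 810_898_762

mapping_rate = (total_pairs - unmapped_pairs) / total_pairs
kept_fraction = output_fragments / total_pairs
print(f"Overall mapping rate: {mapping_rate:.2%}")
print(f"Read pairs kept as output fragments: {kept_fraction:.2%}")
```
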
Overall alignment (mapping) rate: (total - unmapped) / total
Total unique fragments: total - duplicate - unmapped - lowmapq
Duplication rate (saturation): duplicate / (total - unmapped - lowmapq)
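
The three formulas above can be applied to the whole summary.csv by summing each column over all barcodes. The sketch below assumes the file has a header row with columns named total, duplicate, unmapped and lowmapq (plus a barcode column); check the header of your own file before relying on it. The function name and the toy CSV are hypothetical.

```python
import csv
import io


def summarize(rows):
    """Aggregate per-barcode rows and return the three library-level metrics."""
    sums = {"total": 0, "duplicate": 0, "unmapped": 0, "lowmapq": 0}
    for row in rows:
        for key in sums:
            sums[key] += int(row[key])
    # Read pairs that mapped with sufficient MAPQ, before deduplication.
    usable = sums["total"] - sums["unmapped"] - sums["lowmapq"]
    return {
        "mapping_rate": (sums["total"] - sums["unmapped"]) / sums["total"] if sums["total"] else 0.0,
        "unique_fragments": usable - sums["duplicate"],
        "duplication_rate": sums["duplicate"] / usable if usable else 0.0,
    }


# Toy stand-in for summary.csv (values are invented for illustration).
toy_csv = io.StringIO(
    "barcode,total,duplicate,unmapped,lowmapq\n"
    "AAACGAAAG,100,20,10,10\n"
    "TTTGGTTCC,50,5,5,5\n"
)
metrics = summarize(csv.DictReader(toy_csv))
print(metrics)
```

For a real run, replace the toy input with `csv.DictReader(open("summary.csv", newline=""))`. Extra columns in the file are ignored by this sketch.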
