Understanding The Chromap Summary File
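- (Chromap - 简书) With the preset set to atac, the basic workflow is: trim adapters at the 3' end, align, deduplicate at the single-cell level, apply the ATAC peak shift, and then correct barcodes against the provided barcode whitelist.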
Chromap will perform all of the following steps automatically for us:
- Trim adapter sequences at the 3’ end of the reads
- Set the maximum insert size between Read 1 & Read 2 to be 2000 bp
- Deduplicate the mapped read pairs at the single-cell level
- Correct cell barcodes based on provided whitelist
- Shift read positions to make the start positions (5’) of the reads represent the actual Tn5 binding sites (not important for downstream peak calling, though)
- Output read pairs (MAPQ >= 30 by default) as the commonly used BED-like file fragments.tsv. The SAM format is also supported.
chromap -t 20 -x ref.idx -r ref.fa \
--preset atac --summary summary.csv \
-1 mCortex_ATAC_S1_L001_R1_001.fastq.gz \
-2 mCortex_ATAC_S1_L001_R2_001.fastq.gz \
-b mCortex_ATAC_S1_L001_I2_001.fastq.gz \
--barcode-whitelist 737K-cratac-v1_rc.txt \
-o chromap_outs/fragments.tsv \
1> chromap.stdout 2> chromap.stderr \
&& bgzip chromap_outs/fragments.tsv
The results come in two parts: chromap.stderr and summary.csv

- Number of rows in the summary file: all barcodes identified in the barcode FASTQ
- Number of raw reads: 475,581,592 read pairs (× 2 = 951,163,184 reads)
- total column in summary.csv: the number of reads for each barcode sequence (reads per cell)
- Number of mapped reads: 810,898,762 = 792,439,756 + 18,459,006. This is consistent with the summary.csv file, where (total - unmapped) × 2 = (475,581,592 - 70,132,211) × 2 = 810,898,762
- Number of output mappings (passed filters): 161,219,326 (the number of lines in fragments.tsv)
- Number of fragments output for the cells = total - unmapped - lowmapq - duplicate
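The cross-check between chromap.stderr and summary.csv in the list above can be reproduced with a few lines of arithmetic. This is only a sketch that hard-codes the counts quoted in this post; the variable names are made up for illustration.

```python
# Counts copied from the bullets above (names are illustrative only).
total_pairs = 475_581_592        # sum of the "total" column in summary.csv
unmapped_pairs = 70_132_211      # sum of the "unmapped" column in summary.csv
mapped_reads_stderr = 792_439_756 + 18_459_006  # the two counts reported in chromap.stderr

# Each pair contributes two reads, so the pair counts are doubled.
raw_reads = total_pairs * 2
mapped_reads_summary = (total_pairs - unmapped_pairs) * 2

print(f"raw reads:              {raw_reads:,}")
print(f"mapped reads (summary): {mapped_reads_summary:,}")
print(f"matches chromap.stderr: {mapped_reads_summary == mapped_reads_stderr}")
```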
Overall alignment (mapping) rate: (total - unmapped) / total
Total unique fragments: total - duplicate - unmapped - lowmapq
Duplication rate (saturation): duplicate / (total - unmapped - lowmapq)
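As a rough illustration, the three formulas can be computed by summing the per-barcode columns of summary.csv. The sketch below assumes the file has a header row with columns named barcode, total, duplicate, unmapped and lowmapq (check the header of your own file, since it may differ between Chromap versions). The function name `summarize` and the two toy rows are invented for this example and are not real Chromap output.

```python
import csv
import io


def summarize(handle):
    """Sum the per-barcode counts and derive the three summary metrics."""
    sums = {"total": 0, "duplicate": 0, "unmapped": 0, "lowmapq": 0}
    n_barcodes = 0
    for row in csv.DictReader(handle):
        n_barcodes += 1
        for key in sums:
            sums[key] += int(row[key])

    # Pairs that were mapped and passed the MAPQ filter (before deduplication).
    usable = sums["total"] - sums["unmapped"] - sums["lowmapq"]
    return {
        "barcodes": n_barcodes,
        "mapping_rate": (sums["total"] - sums["unmapped"]) / sums["total"] if sums["total"] else 0.0,
        "unique_fragments": usable - sums["duplicate"],
        "duplication_rate": sums["duplicate"] / usable if usable else 0.0,
    }


# Toy input standing in for summary.csv (assumed column names, made-up counts).
example = io.StringIO(
    "barcode,total,duplicate,unmapped,lowmapq\n"
    "AAACGAAAGAAACGCC,100,20,10,10\n"
    "AAACGAAAGAAAGCAG,50,5,5,5\n"
)
metrics = summarize(example)
for name, value in metrics.items():
    print(name, value)
```

For a real run, replace the toy input with `open("summary.csv", newline="")`. Extra columns in the file are ignored, because only the four named count columns are read.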
