CCS: Generate Highly Accurate Single-Molecule Consensus Reads
The latest ccs can be installed via the bioconda package pbccs.
Official site: https://github.com/PacificBiosciences/ccs
# Installation
conda install -c bioconda pbccs
Input: Subreads from a single movie in PacBio BAM format (.subreads.bam).
Output: Consensus reads in a format inferred from the file extension: unaligned BAM (.bam); FASTQ (.fastq); or SMRT Link XML (.consensusreadset.xml) which also generates a corresponding BAM file.
Run on a full movie with the following command:
ccs movie.subreads.bam movie.ccs.bam --noPolish --minPasses 1
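For scripted runs, the same command can be assembled from Python. This is only a minimal sketch: the helper names `build_ccs_command` and `run_ccs` are hypothetical, and the only flags used are the ones shown in the command above.

```python
import shutil
import subprocess


def build_ccs_command(subreads_bam, output_bam, min_passes=1, no_polish=True):
    """Assemble the argument list for the ccs call shown above (hypothetical helper)."""
    cmd = ["ccs", subreads_bam, output_bam]
    if no_polish:
        cmd.append("--noPolish")
    cmd += ["--minPasses", str(min_passes)]
    return cmd


def run_ccs(cmd):
    """Run the command if ccs is on PATH; return None when it is not installed."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, check=True)


cmd = build_ccs_command("movie.subreads.bam", "movie.ccs.bam")
print(" ".join(cmd))
```

Calling `run_ccs(cmd)` would then start the actual job. Passing the arguments as a list avoids shell quoting problems with unusual file names.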
Usage: ccs [options] INPUT OUTPUT
Generate circular consensus sequences (ccs) from subreads.
Basic Options:
-h,--help Output this help.
--version Output version information.
--logFile Log to a file, instead of stderr.
--log-level,--logLevel Set log level: "TRACE", "DEBUG", "INFO", "WARN",
"FATAL". ["WARN"]
-j,--numThreads Number of threads to use, 0 means autodetection.
[0]
Input Filter Options:
--minLength Minimum length of subreads to use for generating
CCS. [10] (note: left at the default)
--maxLength Maximum length of subreads to use for generating
CCS. [21000] (note: raising this might yield some unexpected, longer transcripts)
--minPasses Minimum number of subreads required to generate
CCS. [3] (note: at least 3 subreads, i.e. about 3 trips around the template, before the CCS is considered trustworthy)
--minIdentity Minimum identity of a subread aligned to the draft
consensus to use it for polishing. 0 disables this
filter. [0.82] (note: left at the default; not sure what it does)
--minSnr Minimum SNR of input subreads. [2.5]
--zmws Generate CCS for the provided comma-separated
holenumber ranges only. Default = all
Model Override Options: (note: a model? not sure what this means)
--modelPath Path to a model file or directory containing model files.
--modelSpec Name of chemistry or model to use, overriding default selection.
Processing Options:
--byStrand Generate a consensus for each strand.
--noPolish Only output the initial template derived from the
POA (**faster, less accurate**). (note: i.e. the sequence is not polished?)
--richQVs Emit dq, iq, and sq "rich" quality tracks.
Output Filter Options:
--minPredictedAccuracy Minimum predicted accuracy in [0, 1]. [0.9]
--minReadScore Minimum read score of input subreads. [0.75]
--maxDropFraction Maximum fraction of subreads dropped by polishing
(not input filters) before skipping ZMW. [0.34]
Output Files Options:
--force Overwrite OUTPUT file if present. (note: forcibly overwrites the existing file)
--reportFile Where to write the results report. (note: path of the report file)
["ccs_report.txt"]
Options:
--emit-tool-contract Emit tool contract.
--resolved-tool-contract Use args from resolved tool contract.
Arguments:
input Input file.
output Output file.
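To get a feel for how the input filter defaults above interact, here is a simplified reading of the help text written as Python. It is not how ccs is implemented: the function names are hypothetical, the bounds are assumed to be inclusive, and a single SNR value per ZMW is assumed.

```python
def usable_subreads(lengths, min_length=10, max_length=21000):
    """Keep subread lengths inside the --minLength/--maxLength window (bounds assumed inclusive)."""
    return [n for n in lengths if min_length <= n <= max_length]


def passes_input_filters(lengths, snr, min_passes=3, min_snr=2.5,
                         min_length=10, max_length=21000):
    """Simplified check: enough usable subreads and a sufficient SNR (assumed one value per ZMW)."""
    if snr < min_snr:
        return False
    return len(usable_subreads(lengths, min_length, max_length)) >= min_passes


# Three subreads of about 1.5 kb with a good SNR pass the defaults.
print(passes_input_filters([1500, 1480, 1510], snr=8.0))
# Two subreads fail the default of 3 but pass with min_passes=1, as in the command used above.
print(passes_input_filters([1500, 1480], snr=8.0))
print(passes_input_filters([1500, 1480], snr=8.0, min_passes=1))
```

Under this reading, lowering `--minPasses` to 1 keeps ZMWs that the default would discard, which increases the read count at the cost of per-read accuracy.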
Factors that affect the number and quality of CCS reads:
The longer the polymerase read gets, the more readouts (passes) of the SMRTbell are produced, and consequently the more evidence is accumulated per molecule. This increase in evidence translates into higher consensus accuracy, as depicted in the following sketch:
[Figure: sketch of consensus accuracy rising with the number of passes]
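As a rough numerical illustration of this relationship, separate from the original figure, consider a toy model. Assume each pass reads a given base correctly with the same probability, independently of the other passes, and that the consensus is a simple majority vote. ccs itself uses POA plus polishing, not a majority vote, so the numbers below only show the trend. The function names, the 0.85 per-pass accuracy, and the 2 kb insert length are all assumptions made for the example.

```python
from math import comb


def estimated_passes(polymerase_read_length, insert_length):
    """Rough number of full passes, ignoring adapter sequence (simplifying assumption)."""
    return polymerase_read_length // insert_length


def majority_vote_accuracy(n_passes, per_pass_accuracy):
    """Probability that more than half of n independent passes call a base correctly.

    Ties (possible for even n) are counted as errors.
    """
    p = per_pass_accuracy
    needed = n_passes // 2 + 1
    return sum(
        comb(n_passes, k) * p**k * (1 - p) ** (n_passes - k)
        for k in range(needed, n_passes + 1)
    )


# Toy numbers: a 2 kb insert and an assumed per-pass accuracy of 0.85.
for read_length in (2000, 6000, 10000, 30000):
    n = estimated_passes(read_length, 2000)
    print(read_length, n, round(majority_vote_accuracy(n, 0.85), 6))
```

In this model a single pass stays at the raw accuracy, while each additional pair of passes pushes the consensus accuracy closer to 1.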
Run time:
20190728: started at 00:00, end time not recorded; 100M of transcriptome data