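CPK: A Measure of T Cell (TCR) Clonal Diversity
1. What is CPK
CPK is a measure of T cell receptor (TCR) clonotype diversity [Ref1]. The higher the CPK, the more unique CDR3 sequences are detected after normalizing for library size, which indicates greater T cell clonal diversity.
Ref1 describes it as follows: We therefore used the number of unique CDR3 calls in each sample normalized by the total read count in the TCR region, which we call clonotypes per thousand (kilo) reads (CPK), as a measure of clonotype diversity (Online Methods).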
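2. Why use CPK
At low read coverage, the number of distinct CDR3 sequences grows approximately linearly with the number of reads. As long as read coverage has not reached saturation (which typically takes more than 100 million reads), the ratio of distinct CDR3 sequences to reads does not depend on the sequencing library size [Ref1]. A raw CDR3 count, by contrast, scales with library size, so library size has to be corrected for when computing a diversity metric.
Ref1 describes it as follows:
At low read coverage, the number of distinct CDR3 sequences increases approximately linearly with read count.
This result suggests that the ratio of distinct CDR3 sequence calls over the number of reads is independent of sequence library size, as long as the read count does not reach the saturation level, which is typically larger than 100 million reads.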
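3. When to use CPK
First check whether your CDR3 counts have reached saturation. If the number of CDR3 sequences per sample is strongly linearly correlated with library size, the reads at that coverage have not reached the saturation level [Ref1]. In that case library size must be corrected for, which means using CPK.
Ref1 describes it as follows: Comparing to our observation in Figure 4a, we concluded that the read counts we obtained from RNA-seq data were far from saturation and stayed in the linear phase.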
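The linearity check described above can be sketched in Python. This is only a rough screen under stated assumptions, not the procedure used in Ref1: the function names `pearson_r` and `looks_unsaturated`, the correlation threshold of 0.9, and the toy numbers are all hypothetical and chosen for illustration. A high Pearson correlation alone does not rule out mild saturation, so plotting CDR3 count against read count is still worthwhile.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def looks_unsaturated(read_counts, cdr3_counts, r_threshold=0.9):
    """Return True if CDR3 counts track read counts linearly.

    r_threshold is an assumed cutoff, not a value taken from Ref1.
    """
    return pearson_r(read_counts, cdr3_counts) >= r_threshold


# Hypothetical per-sample values: library size (reads) and distinct CDR3 calls.
toy_read_counts = [1000, 2000, 4000, 8000]
toy_cdr3_counts = [12, 25, 47, 99]

toy_r = pearson_r(toy_read_counts, toy_cdr3_counts)
toy_unsaturated = looks_unsaturated(toy_read_counts, toy_cdr3_counts)
```

If `toy_unsaturated` is True, the samples are plausibly still in the linear phase and normalizing by read count is appropriate.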
4. How to calculate CPK (clonotypes per thousand (kilo) reads)
CPK = (number of unique CDR3 calls) × 1000 / (total read count in the TCR region) [Ref1]
Ref1 describes it as follows: We therefore used the number of unique CDR3 calls in each sample normalized by the total read count in the TCR region, which we call clonotypes per thousand (kilo) reads (CPK), as a measure of clonotype diversity (Online Methods).
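As a minimal sketch, the formula translates directly into a few lines of Python. The function name `cpk`, its input format (a list of CDR3 calls as strings plus a TCR-region read count), and the example sequences and read count are assumptions made for illustration. How the CDR3 calls and the TCR-region read count are obtained depends on the upstream tool and is not covered here.

```python
def cpk(cdr3_calls, tcr_read_count):
    """Clonotypes per thousand (kilo) reads for one sample.

    cdr3_calls: iterable of CDR3 sequence strings (duplicates allowed).
    tcr_read_count: total number of reads mapped to the TCR region.
    """
    if tcr_read_count <= 0:
        raise ValueError("tcr_read_count must be positive")
    unique_cdr3 = len(set(cdr3_calls))  # count each distinct CDR3 once
    return unique_cdr3 * 1000 / tcr_read_count


# Hypothetical sample: 4 CDR3 calls, 3 of them distinct, 2500 TCR-region reads.
sample_cdr3 = [
    "CASSLGQGAYEQYF",
    "CASSLGQGAYEQYF",
    "CASSPGTGGNTEAFF",
    "CAWSVGQGNTEAFF",
]
sample_cpk = cpk(sample_cdr3, 2500)  # 3 * 1000 / 2500 = 1.2
```

Deduplicating with `set` treats two calls as the same clonotype only when the sequence strings match exactly. Any looser definition of a clonotype would need a different grouping step.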
Appendix:
Ref1: Landscape of tumor-infiltrating T cell repertoire of human cancers, Nature Genetics, 2016, 27