方法简述
- 将推定恶性细胞比例调整至20%以下
- 过滤掉信息量较少的基因(默认:少于 10 个细胞表达、在 log2 尺度上的平均表达少于 0.1)
- 转化为Z-score,将尺度限制为-3~3
- 按染色体位置对基因进行排序,估计 CNV 信号(默认 = 100 个基因)
- 用两个参数汇总CNV信号,对恶性细胞和非恶性细胞进行分类。
- CNV 信号(MS,平方均值):CNV信号估计均方。
- CORR(与高 CNV 信号细胞的相关性):每个细胞的 CNV 与前 5% 细胞的平均值的相关性。
- 如果它们的 CNV 信号 (MS) > 0.02(X轴) 或 CNV 相关性 (CORR) > 0.2(Y轴),则鉴定为恶性细胞。
代码可通过原作者Github获取。这里以作者给出的example进行示例。
library(ggplot2)
library(pals)
library(plyr)
library(dplyr)
library(Seurat)
library(gplots)
library(RColorBrewer)
####################################################################################################
source("R/calInferredCNA_for_CEP.R") ## calculate inferredCNV value
source("R/makingTCIDEA_for_CEP.R") ## make TCIDEA object
source("R/TCIDEA_obj_for_CEP.R") ## initiate TCIDEA object
source("R/calCNVScore_for_CEP.R") ## calculate CNV score (MS, Corr) with CEP result
###################################################################################################
## Example data ###################################################################################
#肿瘤样本注释
cell_annotation_with_tumor <- readRDS(file = "example/cell_info_tumor_example.Rds") # celltype = annotation cell types in transcriptome data
#肿瘤样本log2表达矩阵
tumor_example <- readRDS(file = "example/log2TPM_tumor_example.Rds")
#正常细胞log2表达矩阵
normal_example <- readRDS(file = "example/log2TPM_normal_example.Rds")
#基因组位置信息
ref_genome_example <- readRDS(file = "example/refgenome_example.Rds")
#输出目录
output.dir = paste0(getwd(), "/", "result")
###################################################################################################
## PARAMETERS #####################################################################################
#设定推断恶性细胞最大比例
EP_cutoff = 20 ## count if > 20% of EP -> add
#ep细胞亚群的名称
target.celltypes = "EP" ## declare name of epithelial cells in metadata
label = "example"
###################################################################################################
## 1. check proportion of epithelial cells in tumor tissues ##
prop <- as.data.frame(table(cell_annotation_with_tumor$celltype))
prop$Percent = prop$Freq / nrow(cell_annotation_with_tumor) * 100
##2. Check the proportion (adding normal cells or not)
if(prop[prop$Var1 %in% target.celltypes,]$Percent > EP_cutoff){
list <- addNormalDataset(tumor.data = tumor_example, tumor.ident = cell_annotation_with_tumor, target.celltypes = target.celltypes,
normal.data = normal_example)
addnormal_example <- as.matrix(list$data); addnormal_annotation <- list$ident
runCEP(target.normalized = addnormal_example,
sample.info = addnormal_annotation, label = paste0(label,"_AddNormal"),
annotationdata = ref_genome_example, target.celltypes = target.celltypes, output.dir = output.dir,
min.cells = 10, MYwalk = 100) ## Sample list of EP proportion > EP_cutoff (20%)
}else{
runCEP(target.normalized = tumor_example,
sample.info = cell_annotation_with_tumor, label = label,
annotationdata = ref_genome_example,target.celltypes = target.celltypes, output.dir = output.dir,
min.cells = 10, MYwalk = 100) ## Sample list of EP proportion <= EP_cutoff (20%)
}
############################################
结果提取
result=readRDS(file = "result/example_AddNormal_after_calc_CNV_score.Rds")
identification=data.frame(Barcord=result$Index,cell_type=result$cell_index)
head(identification)
后续自行整合至seurat注释进行后续分析
参考来源:
https://github.com/SGI-LungCancer/SingleCell
参考文献:
Single-cell RNA sequencing demonstrates the molecular and cellular reprogramming of metastatic lung adenocarcinoma
鸣谢:
I thank Dr.Jianming Zeng(University of Macau), and all the members of his bioinformatics team, biotrainee, for generously sharing their experience and codes.
问题交流:
Email: xuran@hrbmu.edu.cn