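Hello everyone. With 10X single-cell and 10X spatial transcriptomics in full swing, our analyses and methods need to move into deeper waters, and many in-depth, detailed analyses deserve special attention. Today I will share two very good analysis ideas, in the hope that everyone can dig deeper into their own data and publish high-impact papers.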
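The first point is Spatial Correlation Analysis. I have covered this topic several times, in the posts "Co-localization analysis in 10X spatial transcriptomics (cell types and ligand-receptor genes)", "Spatial expression patterns of genes in 10X spatial transcriptomics", and "On the importance of the spatial distribution of cell communication in 10X spatial transcriptomics (10X single-cell)". This time I want to share some classic and noteworthy methods from the paper Multimodal Analysis of Composition and Spatial Architecture in Human Squamous Cell Carcinoma. Please pay close attention to them.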
We reasoned that genes expressed in adjacent spots in ST were potentially meaningful and that a simple correlation of genes across spots could overlook this adjacency structure within the data (note: this point has been stressed many times and deserves everyone's attention). Thus, we calculated average normalized gene expression (the normalized data) across a "sliding window" of spot groups consisting of a central spot surrounded by its N nearest neighbors (the adjacent spots), where N = 4 in the original ST data and N = 6 in Visium samples for each spot in the tissue, generating a matrix of genes by average spot group expression across all spots (key point: averaging over neighboring spots produces a new matrix). This matrix can be correlated with any "anchoring" gene of interest (FOXP3 in our case) by calculating pairwise Pearson correlations of the FOXP3 expression vector across all spots and the gene average group expression vectors across spots (this is where the preparation pays off). These values reflect if the expression of a gene in the area surrounding the anchoring gene is correlated with the expression of the anchoring gene and termed "spatial gene correlation" with FOXP3.
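To make the sliding-window idea concrete, here is a minimal NumPy sketch, not the authors' code. It assumes a spots-by-genes matrix of normalized expression and a spots-by-2 array of spot coordinates. The function names and the plain k-nearest-neighbor search are my own assumptions. On a real slide, spots at the tissue edge have fewer true grid neighbors, which this simple version ignores.

```python
import numpy as np

def neighbor_averaged_expression(expr, coords, n_neighbors):
    """Average each gene over a spot and its n_neighbors nearest spots.

    expr:   (n_spots, n_genes) normalized expression (assumed input layout)
    coords: (n_spots, 2) spot coordinates
    """
    # Pairwise Euclidean distances between all spots.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Each row: the spot itself (distance 0) followed by its nearest neighbors.
    groups = np.argsort(dist, axis=1, kind="stable")[:, : n_neighbors + 1]
    # expr[groups] has shape (n_spots, n_neighbors + 1, n_genes).
    return expr[groups].mean(axis=1)

def spatial_gene_correlation(expr, coords, anchor_idx, n_neighbors):
    """Pearson correlation of the anchor gene's per-spot expression with
    every gene's neighborhood-averaged expression."""
    smoothed = neighbor_averaged_expression(expr, coords, n_neighbors)
    a = expr[:, anchor_idx] - expr[:, anchor_idx].mean()
    s = smoothed - smoothed.mean(axis=0)
    denom = np.sqrt((a ** 2).sum()) * np.sqrt((s ** 2).sum(axis=0))
    # Genes with zero variance produce NaN rather than raising a warning.
    with np.errstate(invalid="ignore", divide="ignore"):
        return (a @ s) / denom
```

Following the paper's description, n_neighbors would be 4 for original ST data and 6 for Visium, and anchor_idx would be the column of the anchoring gene (FOXP3 in the paper).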
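I have stressed spatial gene correlation analysis many times. A tissue is an ordered physical "entity", and the way cell types and gene expression are distributed across it carries deep biological meaning, so this deserves close attention.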
The second analysis point is combining CellPhoneDB and NicheNet for cell-cell communication analysis, a classic approach.
Ligand-receptor interactions were inferred using a similar approach as previously described (Vento-Tormo et al., 2018) (note: this is the CellPhoneDB analysis). We first calculated average expression of ligand and receptor pairs across cell type pairs in normalized scRNA-seq data from an aggregate of the seven patient tumor samples containing TSK cells (the usual routine). We only considered genes with more than 10% of cells demonstrating expression within each cell type considered. We calculated a null distribution for average ligand-receptor by shuffling cell identities in the aggregated data and re-calculating ligand-receptor average pair expression across 1,000 permutations of randomized cell identities. The P value was the number of randomized pairs exceeding the observed data. For bar plots shown in Figures 6B and 6C, in addition to including only ligand-receptor pairs with p < 0.001, we further thresholded individual ligand or receptor expression with a cutoff of average expression > 0.2 (in log space). The 0.2 cutoff was determined by calculating the average log gene expression distribution for all genes across each cell type, and genes expressed at or above this cutoff corresponded with the top 12% or higher of expressed genes for each cell type. (Note: this is the general CellPhoneDB workflow.)
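The label-shuffling test described above can be sketched as follows. This is an illustration under stated assumptions, not CellPhoneDB itself and not the authors' code. It scores one ligand-receptor pair for one sender/receiver cell type pair as the mean of the two average expressions. It reports the P value as the fraction of permutations scoring at least as high as the observed value, where counting ties is my own conservative choice. The function and argument names are hypothetical.

```python
import numpy as np

def lr_permutation_pvalue(ligand, receptor, labels, sender, receiver,
                          n_perm=1000, min_frac=0.10, seed=0):
    """Permutation test for one ligand-receptor pair.

    ligand, receptor: per-cell normalized expression vectors
    labels:           per-cell cell type labels
    Returns (observed_score, p_value), or None if either gene is expressed
    in 10% or fewer of the cells of its cell type.
    """
    ligand = np.asarray(ligand, dtype=float)
    receptor = np.asarray(receptor, dtype=float)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)

    # Expression filter: require more than min_frac of cells to express the gene.
    if (ligand[labels == sender] > 0).mean() <= min_frac:
        return None
    if (receptor[labels == receiver] > 0).mean() <= min_frac:
        return None

    def score(lab):
        # Mean of average ligand (sender cells) and average receptor (receiver cells).
        return 0.5 * (ligand[lab == sender].mean() + receptor[lab == receiver].mean())

    observed = score(labels)
    # Null distribution: shuffle cell identities, keeping cell type sizes fixed.
    null = np.array([score(rng.permutation(labels)) for _ in range(n_perm)])
    return observed, float((null >= observed).mean())
```

With 1,000 permutations, the smallest nonzero P value this can report is 0.001, so the paper's p < 0.001 filter corresponds to no permutation reaching the observed score.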
For NicheNet analysis, we derived TME cell type signatures by taking the top 100 differentially expressed genes in cells isolated from tumors or normal skin, including B cells, endothelial cells, fibroblasts, Langerhans cells, plasmacytoid DCs, CD1C DCs, CLEC9A DCs, T cells, NK cells, macrophages, and MDSCs (note: anyone familiar with this tool will recognize this step. It needs a list of target genes as input, but choosing those target genes takes care; it is not simply the differences between clusters). We input these signatures into NicheNet to derive a union set of predicted ligands modulating tumor-specific TME cell type signatures (ligands are predicted from the target genes). For ligands predicting TSK modulation, we input the top 100 TSK-differentially expressed genes. The top 15% of predicted ligands (ligand selection) by regulatory potential that also demonstrated significance in our scRNA-seq ligand-receptor interaction analysis. We used the FindAllMarkers function in Seurat to generate average logFC values per cell type compared to other cell types from the scRNAseq data. (Take careful note of this.)
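The ligand selection step, which keeps the top 15% of ligands by regulatory potential and then only those that are also significant in the ligand-receptor analysis, can be sketched in a few lines. NicheNet itself is an R package, so this only illustrates the filtering logic on outputs exported from it. The input structures (a dict of ligand scores and a set of significant ligand-receptor tuples) and the function name are hypothetical.

```python
import math

def select_ligands(ligand_scores, significant_pairs, top_frac=0.15):
    """Keep significant ligand-receptor pairs whose ligand ranks in the
    top `top_frac` of predicted ligands by regulatory potential.

    ligand_scores:     {ligand_name: regulatory_potential} (assumed export)
    significant_pairs: set of (ligand, receptor) tuples that passed the
                       permutation test
    """
    ranked = sorted(ligand_scores, key=ligand_scores.get, reverse=True)
    # Round up so that at least one ligand is always kept.
    n_top = max(1, math.ceil(top_frac * len(ranked)))
    top = set(ranked[:n_top])
    return sorted(pair for pair in significant_pairs if pair[0] in top)
```

The order of the two filters does not matter here, because the result is the intersection of the two ligand sets.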
For ligand-receptor spatial transcriptomic proximity analysis, the average value of all ligand-receptor pairs across the leading edge from the eight sections from patients 2, 4, and 10 were calculated first by averaging the ligand and receptor expression among each leading edge spot and its 4-6 nearest neighbors (depending on ST technology), and then taking the average values of all of these groups of five or seven spots across the leading edge. This calculation for each ligand-receptor pair was then performed on 1,000 randomized permutations of spot identities while preserving total number of spots per replicate section to generate a null distribution per patient. P value was calculated by number of randomized permutation calculations that exceeded the true average. (Note: this is the leading-edge analysis.)
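The leading-edge proximity test can be sketched the same way. This is a simplified, single-section illustration and not the authors' code, since the paper pools eight sections and builds one null distribution per patient. I assume the per-spot pair value is the mean of ligand and receptor expression, that the leading-edge spots are given as a boolean mask, and that shuffling spot identities means permuting expression values across spot positions within the section. All names are hypothetical.

```python
import numpy as np

def knn_groups(coords, n_neighbors):
    """Row i lists spot i followed by its n_neighbors nearest spots."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.argsort(dist, axis=1, kind="stable")[:, : n_neighbors + 1]

def leading_edge_lr_pvalue(ligand, receptor, coords, edge_mask,
                           n_neighbors=6, n_perm=1000, seed=0):
    """Permutation test for ligand-receptor proximity at the leading edge.

    ligand, receptor: per-spot normalized expression vectors
    edge_mask:        boolean vector marking leading-edge spots
    Returns (observed_average, p_value).
    """
    rng = np.random.default_rng(seed)
    # Groups of N + 1 spots centered on each leading-edge spot.
    groups = knn_groups(coords, n_neighbors)[np.asarray(edge_mask, dtype=bool)]
    # Per-spot ligand-receptor pair value (assumed: mean of the two genes).
    pair = 0.5 * (np.asarray(ligand, dtype=float) + np.asarray(receptor, dtype=float))

    def score(values):
        # Groups are equal-sized, so the mean of group means equals this mean.
        return values[groups].mean()

    observed = score(pair)
    # Null: shuffle expression across spot positions; the spot count is unchanged.
    null = np.array([score(rng.permutation(pair)) for _ in range(n_perm)])
    return observed, float((null >= observed).mean())
```

Here n_neighbors would be 4 for original ST data and 6 for Visium, matching the groups of five or seven spots in the text.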
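To summarize briefly: CellPhoneDB identifies ligand-receptor pairs. Starting from the target genes of interest, NicheNet is used to pick out the highly active ligands, which are then matched against the significant ligand-receptor pairs from CellPhoneDB. It sounds simple, but actually carrying it out takes real insight and skill.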
Life is good, and it is waiting for you to go beyond it.