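Author: Evil Genius
Recently a student asked why updating and upgrading a company's analysis pipeline takes so much work.
The reason is simple. Upgrading a pipeline starts with a survey that summarizes every relevant analysis method, algorithm, use case, and trade-off. Take the current spatial platforms: Visium, Xenium, HD, and Stereo-seq. Or take joint single-cell and spatial analysis, where there are dozens of methods. Each platform calls for its own analysis strategy.
In a word, it takes resource integration.
Of course, for your own project you only need to dig into your own direction, and improve yourself whenever you have time. I had no such awareness in graduate school and coasted through to graduation, which I now deeply regret. The gap between me and the classmates who published high-impact papers keeps growing. The saying that a tiny error at the start leads to a huge miss at the end is no exaggeration. I hope everyone takes this as a warning: do not waste your master's or PhD years, because they are the golden period for publishing.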
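If you work with HD or Stereo-seq data, try to stop using bin-based segmentation. Use image-based cell segmentation instead, so that you get a spatial expression matrix at single-cell resolution.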
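To make the difference concrete, here is a minimal NumPy sketch, not any platform's actual pipeline. It aggregates the same per-spot counts once by square bins and once by a segmentation mask. The function names, the toy coordinates, and the convention that mask label 0 means background are all assumptions made for illustration.

```python
import numpy as np

def aggregate_by_bin(x, y, counts, bin_size):
    """Sum per-spot counts into square bins of bin_size x bin_size spots."""
    counts = np.asarray(counts, dtype=float)
    keys = np.stack([np.asarray(x) // bin_size, np.asarray(y) // bin_size], axis=1)
    bins, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.ravel()  # shape differs slightly across NumPy versions
    out = np.zeros((len(bins), counts.shape[1]))
    np.add.at(out, inverse, counts)
    return bins, out

def aggregate_by_mask(x, y, counts, mask):
    """Sum per-spot counts per cell label in a segmentation mask (0 = background)."""
    counts = np.asarray(counts, dtype=float)
    labels = mask[np.asarray(y), np.asarray(x)]
    keep = labels > 0
    cell_ids, inverse = np.unique(labels[keep], return_inverse=True)
    out = np.zeros((len(cell_ids), counts.shape[1]))
    np.add.at(out, inverse, counts[keep])
    return cell_ids, out

# Toy data: four spots in a row, one gene, two segmented cells.
x = np.array([0, 1, 2, 3])
y = np.array([0, 0, 0, 0])
counts = np.array([[1], [2], [3], [4]])
mask = np.array([[1, 1, 1, 2]])  # cell 1 covers three spots, cell 2 covers one

bins, bin_counts = aggregate_by_bin(x, y, counts, bin_size=2)
cell_ids, cell_counts = aggregate_by_mask(x, y, counts, mask)
print(bin_counts.tolist())   # [[3.0], [7.0]]
print(cell_counts.tolist())  # [[6.0], [4.0]]
```

In this toy case the second bin mixes signal from both cells, while the mask-based matrix keeps each cell's counts separate.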
Today we continue upgrading the analysis pipeline, focusing on BGI's Stereo-seq.
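Dissecting the iTME as an organization of cellular neighborhoods (CNs) based on cell composition
Most open-source tools for cell-cell communication analysis are built on gene expression or on cell distance, and do not take the interaction distance between cells into account.
The goal here is to reveal the deep spatial covariation of intercellular communication driven by a spatially heterogeneous TME.
Analysis framework
1. First, run cellular neighborhood analysis.
2. Analyze the spatial interactions between cells: detect active L-R pairs by quantitatively defining the spatial proximity and interaction intensity between cells and genes.
3. Build TME meta-modules across multiple samples, and analyze the spatial interaction modules associated with these TME modules and meta-modules.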
Decoding the iTME into cellular neighborhood (CN) organizational units
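As a rough illustration of what "cell-composition-based neighborhoods" can mean, the sketch below computes, for every cell, the cell-type composition of its k nearest cells. This is a generic approach and an assumption on my part, not SCIITensor's own implementation. The function name and toy data are hypothetical. Clustering these composition vectors (for example with k-means) would then give CN labels.

```python
import numpy as np

def neighborhood_composition(coords, cell_types, k):
    """Return the cell-type fractions among each cell's k nearest cells (itself included)."""
    coords = np.asarray(coords, dtype=float)
    types = sorted(set(cell_types))
    index = {t: i for i, t in enumerate(types)}
    labels = np.array([index[t] for t in cell_types])

    # Pairwise Euclidean distances; fine for toy data, use a KD-tree for real data.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    nearest = np.argsort(dist, axis=1, kind="stable")[:, :k]
    onehot = np.eye(len(types))[labels]
    composition = onehot[nearest].mean(axis=1)  # shape: (n_cells, n_types)
    return types, composition

coords = [[0, 0], [1, 0], [2, 0], [10, 0], [11, 0], [12, 0]]
cell_types = ["Malignant", "Malignant", "T_cell",
              "Fibroblast", "Fibroblast", "Fibroblast"]
types, composition = neighborhood_composition(coords, cell_types, k=3)
print(types)           # ['Fibroblast', 'Malignant', 'T_cell']
print(composition[3])  # [1. 0. 0.]
```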
Spatial cell interaction intensity: inferring spatial intercellular communication
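One simple way to combine expression with spatial proximity is to count a ligand-receptor product only when sender and receiver lie within a given radius. The sketch below does that with NumPy. It illustrates the idea only and is not the formula SCIITensor uses. The function name, the radius, and the toy values are assumptions.

```python
import numpy as np

def lr_intensity(coords, cell_types, ligand, receptor, radius):
    """Sum ligand(sender) * receptor(receiver) over cell pairs closer than radius,
    grouped into a (sender type x receiver type) matrix."""
    coords = np.asarray(coords, dtype=float)
    ligand = np.asarray(ligand, dtype=float)
    receptor = np.asarray(receptor, dtype=float)

    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    neighbors = dist <= radius
    np.fill_diagonal(neighbors, False)  # ignore self-pairs

    pair_score = np.outer(ligand, receptor) * neighbors

    types = sorted(set(cell_types))
    index = {t: i for i, t in enumerate(types)}
    labels = np.array([index[t] for t in cell_types])
    out = np.zeros((len(types), len(types)))
    np.add.at(out, (labels[:, None], labels[None, :]), pair_score)
    return types, out

coords = [[0, 0], [1, 0], [10, 0]]
cell_types = ["T_cell", "Malignant", "T_cell"]
ligand = [2, 0, 3]    # ligand expression per cell
receptor = [0, 5, 1]  # receptor expression per cell
types, intensity = lr_intensity(coords, cell_types, ligand, receptor, radius=2)
print(types)      # ['Malignant', 'T_cell']
print(intensity)  # only T_cell -> Malignant is non-zero (value 10)
```

The distant T cell expresses the ligand too, but it contributes nothing because no receiver sits within the radius.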
Analyzing the tumor microenvironment with spatial transcriptomics
Organizing TME-associated cellular neighborhoods
Deconvolving the iTME for a single CN
Example code
git clone https://github.com/STOmics/SCIITensor.git
cd SCIITensor
python setup.py install
Single sample analysis
import SCIITensor as sct
import scanpy as sc
import pandas as pd
import seaborn as sns
import matplotlib as mpl
import matplotlib.pyplot as plt
import pickle
adata = sc.read("/data/work/LR_TME/Liver/LC5M/sp.h5ad")
lc5m = sct.core.scii_tensor.InteractionTensor(adata, interactionDB="/data/work/database/LR/cellphoneDB_interactions_add_SAA1.csv")
sct.core.scii_tensor.build_SCII(lc5m)
sct.core.scii_tensor.process_SCII(lc5m, bin_zero_remove=True, log_data=True)
sct.core.scii_tensor.eval_SCII_rank(lc5m)
sct.core.scii_tensor.SCII_Tensor(lc5m)
with open("LC5M_res.pkl", "wb") as f:
    pickle.dump(lc5m, f)
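Saving the result with `pickle` means you can reload it later without rerunning the decomposition. Here is a minimal round-trip sketch. It uses a plain dict as a stand-in for the result object and a temporary directory, both of which are assumptions for illustration. To unpickle a real `InteractionTensor`, SCIITensor has to be importable in the session, because pickle restores objects by looking up their class.

```python
import os
import pickle
import tempfile

# Stand-in for the analysis result (hypothetical content).
result = {"sample": "LC5M", "factors": [[0.1, 0.2], [0.3, 0.4]]}

path = os.path.join(tempfile.mkdtemp(), "LC5M_res.pkl")
with open(path, "wb") as f:
    pickle.dump(result, f)

# Later, in a new session: load the saved object instead of recomputing it.
with open(path, "rb") as f:
    restored = pickle.load(f)
```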
# Visualization
## heatmap
sct.core.scii_tensor.plot_tme_mean_intensity(lc5m, tme_module = 0, cellpair_module = 2, lrpair_module = 4,
n_lr = 15, n_cc = 5,
figsize = (10, 2), save = False, size = 2, vmax=1)
factor_cc = lc5m.cc_factor.copy()
factor_cc.columns = factor_cc.columns.map(lambda x: f"CC_Module {x}")
factor_lr = lc5m.lr_factor.copy()
factor_lr.columns = factor_lr.columns.map(lambda x: f"LR_Module {x}")
factor_tme = pd.DataFrame(lc5m.factors[2])
factor_tme.columns = factor_tme.columns.map(lambda x: f"TME {x}")
#draw the heatmap based on the cell-cell factor matrix
fig = sns.clustermap(factor_cc.T, cmap="Purples", standard_scale=0, metric='euclidean', method='ward',
row_cluster=False, dendrogram_ratio=0.05, cbar_pos=(1.02, 0.6, 0.01, 0.3),
figsize=(24, 10),
)
fig.savefig("./factor_cc_heatmap.pdf")
# Select the top ligand-receptor pairs, then draw the heatmap from the ligand-receptor factor matrix
lr_number = 120  # number of top-ranked ligand-receptor pairs to keep
factor_lr_top = factor_lr.loc[factor_lr.sum(axis=1).sort_values(ascending=False).index[0:lr_number]]
fig = sns.clustermap(factor_lr_top.T, cmap="Purples", standard_scale=0, metric='euclidean', method='ward',
row_cluster=False, dendrogram_ratio=0.05, cbar_pos=(1.02, 0.6, 0.01, 0.3),
figsize=(28, 10),
)
fig.savefig("./factor_lr_heatmap.pdf")
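The top-pair selection above ranks ligand-receptor pairs by their summed loading across all LR modules and keeps the first `lr_number` rows. The same pattern on a toy DataFrame looks like this. The helper name and the pair names are made up for illustration.

```python
import pandas as pd

def top_rows_by_total_loading(factor, n):
    """Keep the n rows with the largest row sums, ordered from largest to smallest."""
    order = factor.sum(axis=1).sort_values(ascending=False).index[:n]
    return factor.loc[order]

toy_factor_lr = pd.DataFrame(
    {"LR_Module 0": [0.1, 0.9, 0.0], "LR_Module 1": [0.2, 0.5, 0.1]},
    index=["LIG1-REC1", "LIG2-REC2", "LIG3-REC3"],  # hypothetical pair names
)
top2 = top_rows_by_total_loading(toy_factor_lr, 2)
print(list(top2.index))  # ['LIG2-REC2', 'LIG1-REC1']
```

Note that a pair with a strong loading in one module and near-zero loadings elsewhere can rank below a pair with moderate loadings everywhere. If module-specific pairs matter to you, ranking by the row maximum is an alternative worth considering.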
## sankey
core_df = sct.plot.sankey.core_process(lc5m.core)
sct.plot.sankey.sankey_3d(core_df, link_alpha=0.5, interval=0.001, save="sankey_3d.pdf")
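The Sankey plot is drawn from `lc5m.core`. The result exposes a core plus three factor matrices, which suggests a Tucker-style decomposition. That is my reading of the attributes, not something taken from the SCIITensor source. Under that assumption, each core entry weights one combination of cell-cell module, LR module, and TME module. The toy sketch below uses random data and hypothetical sizes to show how a core and its factors rebuild the tensor. The mode order follows this tutorial, where `factors[0]` holds cell pairs and `factors[2]` holds TME modules. The LR pairs in the middle are an assumption.

```python
import numpy as np

def tucker_reconstruct(core, cc_factor, lr_factor, tme_factor):
    """Rebuild a (cell pair x LR pair x window) tensor from a Tucker core and factors."""
    return np.einsum("abc,ia,jb,kc->ijk", core, cc_factor, lr_factor, tme_factor)

rng = np.random.default_rng(0)
core = rng.random((2, 3, 2))       # (CC modules, LR modules, TME modules)
cc_factor = rng.random((6, 2))     # 6 cell pairs
lr_factor = rng.random((8, 3))     # 8 ligand-receptor pairs
tme_factor = rng.random((10, 2))   # 10 spatial windows

approx = tucker_reconstruct(core, cc_factor, lr_factor, tme_factor)
print(approx.shape)  # (6, 8, 10)

# The largest core entry marks the most strongly coupled module combination.
strongest = np.unravel_index(np.argmax(core), core.shape)
print(strongest)
```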
## circos
interaction_matrix = sct.plot.scii_circos.interaction_select(lc5m.lr_mt_list, factor_cc, factor_lr, factor_tme,
interest_TME='TME 0',
interest_cc_module='CC_Module 3',
interest_LR_module='LR_Module 4',
lr_number=20,
cc_number=10)
plt.figure(figsize=(8, 3))
sns.heatmap(interaction_matrix, vmax=1)
#Draw the circos diagram, which includes cell types, ligand-receptor genes, and the links between ligands and receptors.
cells = ['Hepatocyte', 'Fibroblast', 'Cholangiocyte', 'Endothelial', 'Macrophage', 'Malignant', 'B_cell', 'T_cell', 'DC', 'NK'] #list contains names of all cell types
sct.plot.scii_circos.cells_lr_circos(interaction_matrix, cells, save="cells_lr_circos.pdf")
#Draw the circos which only contains cell types and the links between them.
sct.plot.scii_circos.cells_circos(interaction_matrix, cells, save="cells_circos.pdf")
#Draw circos which only contains ligand-receptor genes
sct.plot.scii_circos.lr_circos(interaction_matrix, cells)
## igraph
sct.plot.scii_net.grap_plot(interaction_matrix, cells,
save="igrap_network.pdf")
# `sankey` is not imported on its own, so call it through the package path
cc_df = sct.plot.sankey.factor_process(lc5m.factors[0], lc5m.cellpair)
sct.plot.sankey.sankey_2d(cc_df)
Multiple sample analysis
adata_LC5P = sc.read("/data/work/LR_TME/Liver/LC5P/FE1/cell2location_map/sp.h5ad")
lc5p = sct.core.scii_tensor.InteractionTensor(adata_LC5P, interactionDB="/data/work/database/LR/cellphoneDB_interactions_add_SAA1.csv")
sct.core.scii_tensor.build_SCII(lc5p)
sct.core.scii_tensor.process_SCII(lc5p)
sct.core.scii_tensor.eval_SCII_rank(lc5p)
sct.core.scii_tensor.SCII_Tensor(lc5p)
with open('LC5P_res.pkl', "wb") as f:
    pickle.dump(lc5p, f)
adata_LC5T = sc.read("/data/work/LR_TME/Liver/LC5T/FD3/cell2location_map/sp.h5ad")
lc5t = sct.core.scii_tensor.InteractionTensor(adata_LC5T, interactionDB="/data/work/database/LR/cellphoneDB_interactions_add_SAA1.csv")
sct.core.scii_tensor.build_SCII(lc5t)
sct.core.scii_tensor.process_SCII(lc5t)
sct.core.scii_tensor.eval_SCII_rank(lc5t)
sct.core.scii_tensor.SCII_Tensor(lc5t)
with open('LC5T_res.pkl', "wb") as f:
    pickle.dump(lc5t, f)
## merge data
all_data = sct.core.scii_tensor.merge_data([lc5t, lc5m, lc5p], patient_id = ['LC5T', 'LC5M', 'LC5P'])
sct.core.scii_tensor.SCII_Tensor_multiple(all_data)
## heatmap
mpl.rcParams.update(mpl.rcParamsDefault)
sct.core.scii_tensor.plot_tme_mean_intensity_multiple(all_data, sample='LC5T',
tme_module=0, cellpair_module=0, lrpair_module=0, vmax=1)