A review of 10X single-cell spatial analysis: stLearn

Today we revisit stLearn, an excellent piece of software.

stSME clustering tutorial (spatial smoothing)

stSME is a novel normalisation method implemented in the stLearn software.

It is designed for spatial transcriptomics data and utilises tissue Spatial location, Morphology, and gene Expression.
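Conceptually, stSME smooths each spot's expression using its neighbours, weighting neighbours by physical distance and morphological (image-feature) similarity. The snippet below is only a minimal numpy sketch of that idea under simple assumptions (the function name, neighbour count, and the 50/50 blend are illustrative choices); it is not stLearn's actual implementation.

import numpy as np

def sme_smooth(expr, coords, morph, n_neighbors=5):
    """expr: spots x genes, coords: spots x 2, morph: spots x image features."""
    n = expr.shape[0]
    out = np.zeros_like(expr, dtype=float)
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)           # physical distance to all spots
        nbrs = np.argsort(d)[1:n_neighbors + 1]                  # nearest neighbours (excluding self)
        m = np.array([np.corrcoef(morph[i], morph[j])[0, 1] for j in nbrs])
        w = np.clip(m, 0, None) / (1 + d[nbrs])                  # combine morphology and distance weights
        w = w / w.sum() if w.sum() > 0 else np.ones(len(nbrs)) / len(nbrs)
        out[i] = 0.5 * expr[i] + 0.5 * (w @ expr[nbrs])          # blend the spot with its weighted neighbours
    return out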

This tutorial demonstrates how to use stLearn to perform stSME clustering for spatial transcriptomics data.

Mouse Brain (Coronal)

1. Preparation

# import module
import stlearn as st
from pathlib import Path
st.settings.set_figure_params(dpi=180)
# specify PATH to data
BASE_PATH = Path("/home/uqysun19/60days/10x_visium/mouse_brain_coronal")

# spot tile is the intermediate result of image pre-processing
TILE_PATH = Path("/tmp/tiles")
TILE_PATH.mkdir(parents=True, exist_ok=True)

# output path
OUT_PATH = Path("/home/uqysun19/60days/stlearn_plot/mouse_brain_coronl")
OUT_PATH.mkdir(parents=True, exist_ok=True)
# load data
data = st.Read10X(BASE_PATH)
# pre-processing for gene count table
st.pp.filter_genes(data,min_cells=1)
st.pp.normalize_total(data)
st.pp.log1p(data)
# pre-processing for spot image
st.pp.tiling(data, TILE_PATH)

# this step uses a deep learning model to extract high-level features from tile images
# it may take a few minutes to complete
st.pp.extract_feature(data)

2. Run stSME clustering

# run PCA for gene expression data
st.em.run_pca(data,n_comps=50)
data_SME = data.copy()
# apply stSME to normalise log transformed data
st.spatial.SME.SME_normalize(data_SME, use_data="raw")
data_SME.X = data_SME.obsm['raw_SME_normalized']
st.pp.scale(data_SME)
st.em.run_pca(data_SME,n_comps=50)
# K-means clustering on stSME normalised PCA
st.tl.clustering.kmeans(data_SME,n_clusters=19, use_data="X_pca", key_added="X_pca_kmeans")
st.pl.cluster_plot(data_SME, use_label="X_pca_kmeans")
../_images/tutorials_stSME_clustering_14_1.png
# louvain clustering on stSME normalised data
st.pp.neighbors(data_SME,n_neighbors=17,use_rep='X_pca')
st.tl.clustering.louvain(data_SME, resolution=1.19)
st.pl.cluster_plot(data_SME,use_label="louvain")
../_images/tutorials_stSME_clustering_15_1.png

We now move to the Mouse Brain (Sagittal Posterior) Visium dataset from the 10x Genomics website.

Mouse Brain (Sagittal Posterior)

1. Preparation

# specify PATH to data
BASE_PATH = Path("/home/uqysun19/60days/10x_visium/mouse_brain_s_p_1/")

# spot tile is the intermediate result of image pre-processing
TILE_PATH = Path("/tmp/tiles")
TILE_PATH.mkdir(parents=True, exist_ok=True)

# output path
OUT_PATH = Path("/home/uqysun19/60days/stlearn_plot/mouse_brain_s_p_1/")
OUT_PATH.mkdir(parents=True, exist_ok=True)
# load data
data = st.Read10X(BASE_PATH)
# pre-processing for gene count table
st.pp.filter_genes(data,min_cells=1)
st.pp.normalize_total(data)
st.pp.log1p(data)
st.pp.scale(data)
# pre-processing for spot image
st.pp.tiling(data, TILE_PATH)

# this step uses a deep learning model to extract high-level features from tile images
# it may take a few minutes to complete
st.pp.extract_feature(data)

2. Run stSME clustering

# run PCA for gene expression data
st.em.run_pca(data,n_comps=50)
data_SME = data.copy()
# apply stSME to normalise log transformed data
# with weights from morphological similarity and physical distance
st.spatial.SME.SME_normalize(data_SME, use_data="raw",
                             weights="weights_matrix_pd_md")
data_SME.X = data_SME.obsm['raw_SME_normalized']
st.pp.scale(data_SME)
st.em.run_pca(data_SME,n_comps=50)
# K-means clustering on stSME normalised PCA
st.tl.clustering.kmeans(data_SME,n_clusters=17, use_data="X_pca", key_added="X_pca_kmeans")
st.pl.cluster_plot(data_SME, use_label="X_pca_kmeans")
../_images/tutorials_stSME_clustering_26_1.png
# louvain clustering on stSME normalised data
st.pp.neighbors(data_SME,n_neighbors=20,use_rep='X_pca')
st.tl.clustering.louvain(data_SME)
st.pl.cluster_plot(data_SME,use_label="louvain")
../_images/tutorials_stSME_clustering_27_1.png

Next we apply stSME clustering to the Human Brain dorsolateral prefrontal cortex (DLPFC) Visium dataset (spatialLIBD).

Human Brain dorsolateral prefrontal cortex (DLPFC)

import pandas as pd
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score, silhouette_score, \
                            homogeneity_completeness_v_measure
from sklearn.metrics.cluster import contingency_matrix
from sklearn.preprocessing import LabelEncoder
import numpy as np
import scanpy
def purity_score(y_true, y_pred):
    # compute contingency matrix (also called confusion matrix)
    cm = contingency_matrix(y_true, y_pred)
    # return purity
    return np.sum(np.amax(cm, axis=0)) / np.sum(cm)
# specify PATH to data
BASE_PATH = Path("/home/uqysun19/60days/Human_Brain_spatialLIBD")
# here we include all 12 samples
sample_list = ["151507", "151508", "151509",
               "151510", "151669", "151670",
               "151671", "151672", "151673",
               "151674", "151675", "151676"]

Ground truth

for i in range(len(sample_list)):
    sample = sample_list[i]

    GROUND_TRUTH_PATH = BASE_PATH / sample / "cluster_labels_{}.csv".format(sample)
    ground_truth_df = pd.read_csv(GROUND_TRUTH_PATH, sep=',', index_col=0)
    ground_truth_df.index = ground_truth_df.index.map(lambda x: x[7:])

    le = LabelEncoder()
    ground_truth_le = le.fit_transform(list(ground_truth_df["ground_truth"].values))
    ground_truth_df["ground_truth_le"] = ground_truth_le

    # load data
    data = st.Read10X(BASE_PATH / sample)
    ground_truth_df = ground_truth_df.reindex(data.obs_names)
    data.obs["ground_truth"] = pd.Categorical(ground_truth_df["ground_truth"])
    st.pl.cluster_plot(data, use_label="ground_truth", cell_alpha=0.5)
[Figures: ground-truth annotation plots for the 12 DLPFC samples]
def calculate_clustering_matrix(pred, gt, sample, methods_):
    df = pd.DataFrame(columns=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"])

    pca_ari = adjusted_rand_score(pred, gt)
    df = df.append(pd.Series([sample, pca_ari, "pca", methods_, "Adjusted_Rand_Score"],
                             index=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"]), ignore_index=True)

    pca_nmi = normalized_mutual_info_score(pred, gt)
    df = df.append(pd.Series([sample, pca_nmi, "pca", methods_, "Normalized_Mutual_Info_Score"],
                             index=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"]), ignore_index=True)

    pca_purity = purity_score(pred, gt)
    df = df.append(pd.Series([sample, pca_purity, "pca", methods_, "Purity_Score"],
                             index=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"]), ignore_index=True)

    pca_homogeneity, pca_completeness, pca_v_measure = homogeneity_completeness_v_measure(pred, gt)

    df = df.append(pd.Series([sample, pca_homogeneity, "pca", methods_, "Homogeneity_Score"],
                             index=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"]), ignore_index=True)

    df = df.append(pd.Series([sample, pca_completeness, "pca", methods_, "Completeness_Score"],
                             index=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"]), ignore_index=True)

    df = df.append(pd.Series([sample, pca_v_measure, "pca", methods_, "V_Measure_Score"],
                             index=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"]), ignore_index=True)
    return df
df = pd.DataFrame(columns=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"])
for i in range(12):
    sample = sample_list[i]
    GROUND_TRUTH_PATH = BASE_PATH / sample / "cluster_labels_{}.csv".format(sample)
    ground_truth_df = pd.read_csv(GROUND_TRUTH_PATH, sep=',', index_col=0)
    ground_truth_df.index = ground_truth_df.index.map(lambda x: x[7:])

    le = LabelEncoder()
    ground_truth_le = le.fit_transform(list(ground_truth_df["ground_truth"].values))
    ground_truth_df["ground_truth_le"] = ground_truth_le
    TILE_PATH = Path("/tmp/{}_tiles".format(sample))
    TILE_PATH.mkdir(parents=True, exist_ok=True)
    data = st.Read10X(BASE_PATH / sample)
    ground_truth_df = ground_truth_df.reindex(data.obs_names)
    n_cluster = len((set(ground_truth_df["ground_truth"]))) - 1
    data.obs['ground_truth'] = ground_truth_df["ground_truth"]
    ground_truth_le = ground_truth_df["ground_truth_le"]

    # pre-processing for gene count table
    st.pp.filter_genes(data,min_cells=1)
    st.pp.normalize_total(data)
    st.pp.log1p(data)

    # run PCA for gene expression data
    st.em.run_pca(data,n_comps=15)

    # pre-processing for spot image
    st.pp.tiling(data, TILE_PATH)

    # this step uses a deep learning model to extract high-level features from tile images
    # it may take a few minutes to complete
    st.pp.extract_feature(data)

    # stSME
    st.spatial.SME.SME_normalize(data, use_data="raw", weights="physical_distance")
    data_ = data.copy()
    data_.X = data_.obsm['raw_SME_normalized']

    st.pp.scale(data_)
    st.em.run_pca(data_,n_comps=15)

    st.tl.clustering.kmeans(data_, n_clusters=n_cluster, use_data="X_pca", key_added="X_pca_kmeans")
    st.pl.cluster_plot(data_, use_label="X_pca_kmeans")

    methods_ = "stSME_disk"
    results_df = calculate_clustering_matrix(data_.obs["X_pca_kmeans"], ground_truth_le, sample, methods_)
    df = df.append(results_df, ignore_index=True)
[Figures: stSME k-means cluster plots for the 12 DLPFC samples]
# read clustering results from other methods
pca_df = pd.read_csv("./stSME_matrices_other_methods.csv")
df_all = pca_df.append(df, ignore_index=True)
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_style("white")
from matplotlib.patches import PathPatch

def adjust_box_widths(g, fac):
    """Adjust the widths of a seaborn-generated boxplot."""

    # iterating through Axes instances
    for ax in g.axes:

        # iterating through axes artists:
        for c in ax.get_children():

            # searching for PathPatches
            if isinstance(c, PathPatch):
                # getting current width of box:
                p = c.get_path()
                verts = p.vertices
                verts_sub = verts[:-1]
                xmin = np.min(verts_sub[:, 0])
                xmax = np.max(verts_sub[:, 0])
                xmid = 0.5*(xmin+xmax)
                xhalf = 0.5*(xmax - xmin)

                # setting new width of box
                xmin_new = xmid-fac*xhalf
                xmax_new = xmid+fac*xhalf
                verts_sub[verts_sub[:, 0] == xmin, 0] = xmin_new
                verts_sub[verts_sub[:, 0] == xmax, 0] = xmax_new

                # setting new width of median line
                for l in ax.lines:
                    if np.all(l.get_xdata() == [xmin, xmax]):
                        l.set_xdata([xmin_new, xmax_new])

class GridShader():
    def __init__(self, ax, first=True, **kwargs):
        self.spans = []
        self.sf = first
        self.ax = ax
        self.kw = kwargs
        self.ax.autoscale(False, axis="x")
        self.cid = self.ax.callbacks.connect('xlim_changed', self.shade)
        self.shade()
    def clear(self):
        for span in self.spans:
            try:
                span.remove()
            except:
                pass
    def shade(self, evt=None):
        self.clear()
        xticks = self.ax.get_xticks()
        xlim = self.ax.get_xlim()
        xticks = xticks[(xticks > xlim[0]) & (xticks < xlim[-1])]
        locs = np.concatenate(([[xlim[0]], xticks+0.5, [xlim[-1]]]))

        start = locs[1-int(self.sf)::2]
        end = locs[2-int(self.sf)::2]

        for s, e in zip(start, end):
            self.spans.append(self.ax.axvspan(s, e, zorder=0, **self.kw))

import seaborn as sns
fig = plt.figure(figsize=(15, 5))
a = sns.boxplot(x="Method", y="Score", hue="test",
                width=0.7,
                data=df_all)
a.set_xticklabels(a.get_xticklabels(), rotation=45)
sns.despine(left=True)
a.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
adjust_box_widths(fig, 0.7)
plt.autoscale()
gs = GridShader(a, facecolor="lightgrey", first=False, alpha=0.7)
plt.tight_layout()
#plt.savefig("./clustering_performace.png", dpi=300)
plt.show()
../_images/tutorials_stSME_clustering_42_1.png

stSME normalization & imputation effects (data imputation)

This tutorial shows the effect of stSME normalization by comparing two scenarios: (1) normal (without stSME), and (2) stSME applied to raw gene counts.

In this tutorial we use the Mouse Brain (Coronal) Visium dataset from the 10x Genomics website.

# import module
import stlearn as st
from pathlib import Path
st.settings.set_figure_params(dpi=180)
# specify PATH to data
BASE_PATH = Path("/home/uqysun19/60days/10x_visium/mouse_brain_coronal")

# spot tile is the intermediate result of image pre-processing
TILE_PATH = Path("/tmp/tiles")
TILE_PATH.mkdir(parents=True, exist_ok=True)

# output path
OUT_PATH = Path("/home/uqysun19/60days/stlearn_plot/mouse_brain_coronl")
OUT_PATH.mkdir(parents=True, exist_ok=True)
# load data
data = st.Read10X(BASE_PATH)
# pre-processing for gene count table
st.pp.filter_genes(data,min_cells=1)
st.pp.normalize_total(data)
st.pp.log1p(data)
# pre-processing for spot image
st.pp.tiling(data, TILE_PATH)

# this step uses a deep learning model to extract high-level features from tile images
# it may take a few minutes to complete
st.pp.extract_feature(data)

(1) normal (without stSME)

data_normal = data.copy()

marker gene for CA3

i="Lhfpl1"
st.pl.gene_plot(data_normal, gene_symbols=i, size=3,
                    fname=str(OUT_PATH) + "/without_SME_{}".format(str(i)) + ".png")
../_images/tutorials_stSME_comparison_11_1.png

marker gene for DG

i="Pla2g2f"
st.pl.gene_plot(data_normal, gene_symbols=i, size=3,
                    fname=str(OUT_PATH) + "/without_SME_{}".format(str(i)) + ".png")
../_images/tutorials_stSME_comparison_13_0.png

(2) stSME applied on raw gene counts

data_SME = data.copy()
# apply stSME to normalise log transformed data
st.spatial.SME.SME_normalize(data_SME, use_data="raw")
data_SME.X = data_SME.obsm['raw_SME_normalized']
st.em.run_pca(data_SME,n_comps=50)

marker gene for CA3

i="Lhfpl1"
st.pl.gene_plot(data_SME, gene_symbols=i, size=3,
                    fname=str(OUT_PATH) + "/without_SME_{}".format(str(i)) + ".png")
../_images/tutorials_stSME_comparison_17_1.png

marker gene for DG

i="Pla2g2f"
st.pl.gene_plot(data_SME, gene_symbols=i, size=3,
                    fname=str(OUT_PATH) + "/without_SME_{}".format(str(i)) + ".png")
../_images/tutorials_stSME_comparison_19_0.png

Spatial trajectory inference analysis tutorial (spatial trajectories)

1. Preparation

Since we want to keep the focus on spatial trajectory inference, we provide the code for every preceding step, from reading data and preprocessing to clustering, to make it easier for users to follow.

Read and preprocess data

import stlearn as st
import scanpy as sc
sc.settings.verbosity = 3
st.settings.set_figure_params(dpi=120)
# Reading data
data = st.Read10X(path="path_to_BCBA")
# Save raw_count
data.raw = data
# Preprocessing
st.pp.filter_genes(data,min_cells=3)
st.pp.normalize_total(data)
st.pp.log1p(data)
st.pp.scale(data)

Clustering data

# Run PCA
st.em.run_pca(data,n_comps=50,random_state=0)
# Tiling image
st.pp.tiling(data,out_path="tiling",crop_size = 40)
# Using Deep Learning to extract feature
st.pp.extract_feature(data)
# Apply stSME spatial-PCA option
st.spatial.morphology.adjust(data,use_data="X_pca",radius=50,method="mean")
st.pp.neighbors(data,n_neighbors=25,use_rep='X_pca_morphology',random_state=0)
st.tl.clustering.louvain(data,random_state=0)
st.pl.cluster_plot(data,use_label="louvain",image_alpha=1,size=7)
../_images/tutorials_Pseudo-time-space-tutorial_8_0.png
st.add.annotation(data,label_list=['Fatty tissue,immune/lymphoid 1 MALAT1+',
                                   'Invasive cancer,fibrous tissue 1 CXCL14+',
                                   'Invasive cancer,fibrous tissue 2 CRISP3+',
                                   'Invasive cancer,fibrous tissue, fatty tissue',
                                   'Fatty tissue,immune/lymphoid 2 IGKC+',
                                   'Fibrous tissue',
                                   'Invasive cancer,fibrous tissue (DCIS)',
                                   'Fatty tissue, Fibrous tissue',
                                   'Invasive cancer,immune/lymphoid (IDC)' ,
                                   'Invasive cancer,fatty tissue 3 MUC5B+',
                                   'Fatty tissue'],
                 use_label="louvain")
st.pl.cluster_plot(data,use_label="louvain_anno",image_alpha=1,size=7)
../_images/tutorials_Pseudo-time-space-tutorial_9_1.png

2. Spatial trajectory inference

Choosing root

3733 is the index of the spot that we chose as the root. It is in the DCIS cluster (cluster 6). We recommend choosing a root spot at one end of a cluster in UMAP space; you can use the min/max point of a cluster in UMAP as the root.
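If you want to pick such a root programmatically, one possible approach (assuming a UMAP embedding has already been computed and stored in data.obsm["X_umap"]) is to take the spot at one extreme of the target cluster along the first UMAP axis:

import numpy as np

cluster = "6"                                          # the DCIS cluster
in_cluster = np.flatnonzero(data.obs["louvain"].values == cluster)
umap1 = data.obsm["X_umap"][in_cluster, 0]
root = in_cluster[np.argmin(umap1)]                    # spot at one extreme of the cluster in UMAP
print("candidate root spot index:", root)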

data.uns["iroot"] = 3733
st.spatial.trajectory.pseudotime(data,eps=50,use_rep="X_pca",use_sme=True)

Spatial trajectory inference - global level

We run the global level of the pseudo-time-space (PSTS) method to reconstruct the spatial trajectory between cluster 6 (DCIS) and cluster 8 (IDC lesions).

st.spatial.trajectory.pseudotimespace_global(data,use_label="louvain",list_clusters=[6,8])
st.pl.cluster_plot(data,use_label="louvain",show_trajectories=True,list_clusters=[6,8],show_subcluster=True)
../_images/tutorials_Pseudo-time-space-tutorial_14_3.png
st.pl.trajectory.tree_plot(data)
../_images/tutorials_Pseudo-time-space-tutorial_15_0.png

Transition markers detection

Based on the spatial trajectory/tree plot, we can see two clades starting from sub-clusters 6 and 13. We then run the function to detect genes that are highly correlated with the PSTS values.
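As a rough illustration of what "transition markers" means here, genes can be ranked by their Spearman correlation with the pseudotime values. The snippet below is only a sketch: it assumes the pseudotime is stored in data.obs["dpt_pseudotime"], mirrors the 0.3 cutoff of the calls below, and is not a re-implementation of detect_transition_markers_clades (which works per clade and applies additional filtering).

import numpy as np
from scipy.stats import spearmanr

pt = data.obs["dpt_pseudotime"].values
X = data.X if not hasattr(data.X, "toarray") else data.X.toarray()
# Spearman correlation of every gene with the pseudotime values
rho = np.array([spearmanr(X[:, g], pt).correlation for g in range(X.shape[1])])
markers = data.var_names[np.abs(rho) > 0.3]
print(len(markers), "genes pass |rho| > 0.3")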

st.spatial.trajectory.detect_transition_markers_clades(data,clade=6,use_raw_count=False,cutoff_spearman=0.3)
st.spatial.trajectory.detect_transition_markers_clades(data,clade=13,use_raw_count=False,cutoff_spearman=0.3)

In the transition markers plot, genes on the left (red) are negatively correlated with the spatial trajectory and genes on the right (blue) are positively correlated with it.

st.pl.trajectory.transition_markers_plot(data,top_genes=30,trajectory="clade_6")
../_images/tutorials_Pseudo-time-space-tutorial_20_0.png
st.pl.trajectory.transition_markers_plot(data,top_genes=30,trajectory="clade_13")
../_images/tutorials_Pseudo-time-space-tutorial_21_0.png

We also provide a function to compare the transition markers between two clades.

st.spatial.trajectory.compare_transitions(data,trajectories=("clade_6","clade_13"))
st.settings.set_figure_params(dpi=150)
st.pl.trajectory.DE_transition_plot(data,top_genes=10)
../_images/tutorials_Pseudo-time-space-tutorial_24_0.png

We can visualize some genes that differ between the two clades.

st.pl.gene_plot(data,gene_symbols="CTPS2",list_clusters=[6,8])
../_images/tutorials_Pseudo-time-space-tutorial_26_0.png
st.pl.gene_plot(data,gene_symbols="IFITM3",list_clusters=[6,8])
../_images/tutorials_Pseudo-time-space-tutorial_27_0.png

stLearn cell-cell interaction analysis (spatial neighbourhood communication)

1. Data loading and preprocessing

import stlearn as st
import pandas as pd
import random
st.settings.set_figure_params(dpi=100)
# read in visium dataset downloaded from: support.10xgenomics.com/spatial-gene-expression/datasets/1.0.0/V1_Breast_Cancer_Block_A_Section_2
data = st.Read10X("[PATH_TO_DATASET]")
# st.add.image(adata=data, imgpath="C:\\Users\\uqjxu8\\GIH\\Bioinformatics\\SPA\\Data\\visium\\Human_Breast_Cancer_Block_A_Section_1\\spatial\\tissue_hires_nobg.png",
#             library_id="V1_Breast_Cancer_Block_A_Section_1",visium=True)
# preprocessing
st.pp.filter_genes(data,min_cells=3)
st.pp.normalize_total(data)
st.pp.scale(data)

2. Count cell type diversity (between-spots)

Read in the cell type predictions for each spot from the Seurat label transfer results, which are based on a labelled scRNA-seq dataset.
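Before loading, it can be useful to peek at the label-transfer table with pandas. The snippet below assumes the file is spots (rows) by predicted cell types (columns), matching what the Seurat export typically produces; adjust sep and index_col to your own export.

import pandas as pd

lt = pd.read_csv('tutorials/label_transfer_bc.csv', sep='\t', index_col=0)
print(lt.shape)   # spots x cell types (assumed orientation)
print(lt.head())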

st.add.labels(data, 'tutorials/label_transfer_bc.csv', sep='\t')
st.pl.cluster_plot(data,use_label="predictions")
../_images/tutorials_stLearn-CCI_4_1.png

Count cell type diversity for between-spot mode

st.tl.cci.het.count(data, use_label='label_transfer')
st.pl.het_plot(data, use_het='cci_het')
../_images/tutorials_stLearn-CCI_6_1.png

3. Ligand-receptor co-expression (between-spots)

Read in a user-specified LR pair

data.uns["lr"] = ['IL34_CSF1R']

Ligand-receptor co-expression in the neighbouring spots

st.tl.cci.lr(adata=data)
st.pl.het_plot(data, use_het='cci_lr', image_alpha=0.7)
Altogether 2 valid L-R pairs
L-R interactions with neighbours are counted and stored into adata.obsm['cci_lr']
../_images/tutorials_stLearn-CCI_11_1.png

4. Calculate merged CCI scores (between-spots)

st.tl.cci.merge(data, use_lr='cci_lr', use_het='cci_het')
st.pl.het_plot(data, use_het='merged', cell_alpha=0.7)
../_images/tutorials_stLearn-CCI_13_1.png

5. Permutation test (between-spots) (could be time consuming)

Permutation Run

st.tl.cci.permutation(data, use_het='cci_het', n_pairs=200)
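As a generic illustration (not stLearn's implementation), a permutation test compares the observed score against a null distribution of scores computed from randomised inputs; the p-value is roughly the fraction of null scores that are at least as extreme as the observed one.

import numpy as np

def permutation_pvalue(observed, null_scores):
    # empirical one-sided p-value with the usual +1 correction
    null_scores = np.asarray(null_scores)
    return (np.sum(null_scores >= observed) + 1) / (len(null_scores) + 1)

# e.g. an observed merged CCI score for a spot vs. 200 scores from a random null
print(permutation_pvalue(5.2, np.random.normal(2.0, 1.0, 200)))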

Significance Test against null distribution

# plot the -log10(pvalue) from permutation test on each spot
st.pl.het_plot(data, use_het='merged_pvalues', cell_alpha=0.7)
../_images/tutorials_stLearn-CCI_18_0.png

Final significant hotspots

st.pl.het_plot(data, use_het='merged_sign', cell_alpha=0.7)
../_images/tutorials_stLearn-CCI_20_0.png

6. Count cell type diversity (within-spots)

st.tl.cci.het.count(data, use_label='label_transfer', distance=0)
st.pl.het_plot(data, use_het='cci_het')
../_images/tutorials_stLearn-CCI_22_1.png

7. Ligand-receptor co-expression (within-spot)

Read in a user-specified LR pair

data.uns["lr"] = ['IL34_CSF1R']

Ligand-receptor co-expression within each spot

st.tl.cci.lr(adata=data, distance=0)
st.pl.het_plot(data, use_het='cci_lr', cell_alpha=0.7)
../_images/tutorials_stLearn-CCI_27_1.png

8. Calculate merged CCI scores (within-spot)

st.tl.cci.merge(data, use_lr='cci_lr', use_het='cci_het')
st.pl.het_plot(data, use_het='merged', cell_alpha=0.7)
../_images/tutorials_stLearn-CCI_29_1.png

9. Permutation test (within-spot) (could be time consuming)

Permutation Run

st.tl.cci.permutation(data, use_het='cci_het', n_pairs=200, distance=0)

Significance Test against null distribution

# plot the -log10(pvalue) from permutation test on each spot
st.pl.het_plot(data, use_het='merged_pvalues', cell_alpha=0.7)
../_images/tutorials_stLearn-CCI_34_0.png

Final significant hotspots

st.pl.het_plot(data, use_het='merged_sign', cell_alpha=0.7)
../_images/tutorials_stLearn-CCI_36_0.png

Core plotting functions

Loading processed data

import stlearn as st
import scanpy as sc
st.settings.set_figure_params(dpi=120)

# Ignore all warnings
import warnings
warnings.filterwarnings("ignore")
# Read raw data
adata = sc.datasets.visium_sge()
adata = st.convert_scanpy(adata)

# Read processed data
adata_processed = st.datasets.example_bcba()

Gene plot

Here is the standard plot for gene expression; we provide two options, for single genes and for multiple genes:

st.pl.gene_plot(adata, gene_symbols="BRCA1")
../_images/tutorials_Core_plots_7_0.png

For multiple genes, you can combine them using 'CumSum' (cumulative sum) or 'NaiveMean' (naive mean):

st.pl.gene_plot(adata, gene_symbols=["BRCA1","BRCA2"], method="CumSum")
../_images/tutorials_Core_plots_9_0.png
st.pl.gene_plot(adata, gene_symbols=["BRCA1","BRCA2"], method="NaiveMean")
../_images/tutorials_Core_plots_10_0.png
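As a rough intuition for the two modes (assumed behaviour; see the stLearn source for the exact definitions), 'CumSum' reflects summing and 'NaiveMean' reflects averaging the expression of the selected genes per spot:

import numpy as np

genes = ["BRCA1", "BRCA2"]
idx = [adata.var_names.get_loc(g) for g in genes]
X = adata.X if not hasattr(adata.X, "toarray") else adata.X.toarray()
combined_sum = X[:, idx].sum(axis=1)    # roughly what "CumSum" reflects
combined_mean = X[:, idx].mean(axis=1)  # roughly what "NaiveMean" reflects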

You can also plot genes as a contour plot to see the spatial distribution more clearly:

st.pl.gene_plot(adata, gene_symbols="GAPDH", contour=True,cell_alpha=0.5)
../_images/tutorials_Core_plots_12_0.png

You can change step_size to adjust the contour level spacing:

st.pl.gene_plot(adata, gene_symbols="GAPDH", contour=True,cell_alpha=0.5, step_size=200)
../_images/tutorials_Core_plots_14_0.png

Cluster plot

We provide several options for displaying clustering results; the show_* parameters control which parts of the figure are displayed:

st.pl.cluster_plot(adata_processed,use_label="louvain")
../_images/tutorials_Core_plots_17_0.png
st.pl.cluster_plot(adata_processed,use_label="louvain",show_cluster_labels=True,show_color_bar=False)
../_images/tutorials_Core_plots_18_0.png

Subcluster plot

We also provide an option to plot spatial subclusters based on spatial location within a cluster.

You have two options here: display subclusters for multiple clusters using show_subcluster in st.pl.cluster_plot, or use st.pl.subcluster_plot to display the subclusters within a single cluster, each in a different color.

st.pl.cluster_plot(adata_processed,use_label="louvain",show_subcluster=True,show_color_bar=False, list_clusters=[6,8])
../_images/tutorials_Core_plots_21_0.png
st.pl.subcluster_plot(adata_processed, use_label="louvain", cluster = 6)
../_images/tutorials_Core_plots_22_0.png

Spatial trajectory plot

We provide st.pl.trajectory.pseudotime_plot to visualize the PAGA graph mapped onto the spatial transcriptomics array.

st.pl.trajectory.pseudotime_plot(adata_processed, use_label="louvain", show_node=True)
../_images/tutorials_Core_plots_25_0.png

You can plot the spatial trajectory results with the node of each subcluster using the show_trajectories and show_node parameters.

st.pl.cluster_plot(adata_processed,use_label="louvain",show_trajectories=True,
                   show_color_bar=True, list_clusters=[6,8], show_node=True)
../_images/tutorials_Core_plots_27_0.png

Cell-cell interaction plot

For cell-cell interaction, you can display the result with st.pl.het_plot. We also provide an option to display it as a contour plot.

st.pl.het_plot(adata_processed, use_het="lr_pvalues")
../_images/tutorials_Core_plots_30_0.png
st.pl.het_plot(adata_processed, use_het="lr",contour=True,step_size=1)
../_images/tutorials_Core_plots_31_0.png

Spatial transcriptomics deconvolution visualization (integrating single-cell and spatial data)

SPOTlight

You can follow the tutorial of SPOTlight in their git repository: https://github.com/MarcElosua/SPOTlight

After completing that tutorial, you can run this R code to produce deconvolution_result.csv:

# This is R code. Run it after finishing the SPOTlight tutorial

tmp = as.data.frame(decon_mtrx)
row.names(tmp) <- row.names(brain@meta.data)
write.csv(t(tmp[1:(length(tmp)-1)]),"deconvolution_result.csv")

Note that:
- brain is the Seurat object of the spatial transcriptomics data
- we save decon_mtrx to a .csv file as the input of our deconvolution visualization function

Seurat label transferring

Seurat provided a vignette about spatial transcriptomics analysis (https://satijalab.org/seurat/v3.2/spatial_vignette.html).

In the section "Integration with single-cell data", you can follow the steps to perform label transfer.

After that, you can run this R code to produce deconvolution_result.csv:

# This is R code. Run it after finishing the "Integration with single-cell data" tutorial

write.csv(predictions.assay@data[-nrow(predictions.assay@data),],"deconvolution_result.csv")

Other software

If you use other software, you should convert the result to this format:

[Screenshot of the expected deconvolution_result table]
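If your deconvolution tool runs in Python instead, a minimal pandas sketch (with hypothetical toy values) of writing a table in the same orientation as the R exports above, i.e. cell types as rows and spot barcodes as columns, could look like this:

import pandas as pd

# toy spots-by-cell-types proportion matrix (replace with your tool's output)
proportions = pd.DataFrame(
    {"Tcell": [0.2, 0.7], "Bcell": [0.5, 0.1], "Tumour": [0.3, 0.2]},
    index=["AAACAAGTATCTCCCA-1", "AAACAATCTACTAGCA-1"],  # spot barcodes
)
# transpose so cell types become rows, matching the assumed input format
proportions.T.to_csv("deconvolution_result.csv")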

stLearn deconvolution visualization

Running the basic analysis steps of stLearn

import stlearn as st
data = st.Read10X(path="BRAIN_PATH")
data.var_names_make_unique()
st.pp.filter_genes(data,min_cells=3)
st.pp.normalize_total(data)
st.pp.log1p(data)
st.pp.scale(data)
st.em.run_pca(data,n_comps=10,random_state=0)
st.pp.neighbors(data,n_neighbors=15,use_rep='X_pca',random_state=0)
st.tl.clustering.louvain(data,random_state=0)
st.pl.cluster_plot(data,use_label="louvain",image_alpha=1,size=7)
../_images/tutorials_ST_deconvolution_visualization_17_1.png

Add the annotation from other software and visualize it as a scatter pie plot with a donut chart.

st.add.add_deconvolution(data,annotation_path="deconvolution_result.csv")

We also provide a threshold parameter that filters based on a quantile; the objective is to remove noisy labels.
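A minimal sketch of the idea behind quantile-based filtering (assumed behaviour of the threshold parameter, not stLearn's exact code): keep only the cell types whose proportion in a spot exceeds the given quantile of that spot's proportions.

import numpy as np

props = np.array([0.02, 0.05, 0.63, 0.30])            # proportions for one spot
threshold = 0.5
keep = props > np.quantile(props, threshold)          # drop low-proportion labels
print(keep)  # [False False  True  True]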

st.pl.deconvolution_plot(data,threshold=0.5)
../_images/tutorials_ST_deconvolution_visualization_21_0.png
st.pl.deconvolution_plot(data,threshold=0.0)
../_images/tutorials_ST_deconvolution_visualization_22_0.png

You can also examine the proportion of cell types in each cluster:

st.pl.deconvolution_plot(data,cluster=9)
../_images/tutorials_ST_deconvolution_visualization_24_0.png

Life is good, and even better with you.
