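Workflow Optimization: Spatial HD Multi-Sample Integration + Neighborhood Analysis

Author: Evil Genius

Life is not easy, so cherish it as you go; you never know which day this person will be gone.

In the previous post we shared a Python workflow for single-sample analysis. It is fairly simple and differs little from ordinary Visium; the only extra step is converting one file format. In the article "Full Workflow Update: Spatial HD Data (Data Analysis + Image Recognition)" we implemented data analysis plus image segmentation for Spatial HD and obtained single-cell-resolution spatial data from the HD data. That benefits many downstream analyses: it is like having Xenium data, and high-throughput Xenium data at that.

That workflow generates many files, one of which is the single-cell-level spatial matrix.

The analysis output directory: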

.
└── cache/
    └── <analysis_name>/
        ├── chunks/
        │   ├── bins_gdf/
        │   │   └── patch_<patch_id>.csv
        │   ├── cells_gdf/
        │   │   └── patch_<patch_id>.csv
        │   └── <bin_to_cell_method>/
        │       ├── bin_to_cell_assign/
        │       │   └── patch_<patch_id>.csv
        │       ├── cell_ix_lookup/
        │       │   └── patch_<patch_id>.csv
        │       └── <cell_annotation_method>_results/
        │           ├── cells_adata.csv
        │           └── merged_results.csv
        └── cells_df.csv

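Here, cells_gdf is the folder containing GeoPandas dataframes representing the cells segmented in the tissue, and bin_to_cell_assign holds the single-cell-level spatial matrix produced by segmentation. This is the file we need to read; the data is in h5 format, which is easier to read from Python.

More importantly, merged_results.csv contains the annotated cell-type information.

From here, integration is straightforward: standard single-cell methods will do. Because spatial information has to be carried along, though, the demands on the server are somewhat higher, and a GPU is preferable.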
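The directory tree lists one patch_<patch_id>.csv per patch inside each chunk folder, so a first step is usually to stitch the patches back together. The following is only a minimal sketch of that idea: the helper name `load_patches` is hypothetical, and it assumes nothing about the columns inside the files beyond all patches sharing the same layout.

```python
from pathlib import Path

import pandas as pd


def load_patches(chunk_dir):
    """Concatenate every patch_<patch_id>.csv found in one chunk folder."""
    files = sorted(Path(chunk_dir).glob("patch_*.csv"))
    if not files:
        raise FileNotFoundError(f"no patch_*.csv files under {chunk_dir}")
    frames = []
    for path in files:
        frame = pd.read_csv(path)
        # Keep track of the source patch, e.g. patch_0.csv -> "0"
        frame["patch_id"] = path.stem.split("_", 1)[1]
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)
```

Pointing `load_patches` at a folder such as bin_to_cell_assign/ would return a single table with an extra `patch_id` column, which makes it easy to trace a row back to its patch.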
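To use those annotations downstream, the cell types have to be joined onto the per-cell table by cell identifier. A rough sketch of such a join follows; the column names `id` and `cell_type` are assumptions made for illustration, not the documented layout of merged_results.csv, so check the real header before reusing it.

```python
import pandas as pd


def attach_cell_types(obs, annotations, id_col="id", type_col="cell_type"):
    """Left-join a cell-type column onto a per-cell table indexed by cell id.

    `id_col` and `type_col` are hypothetical column names.
    """
    lookup = annotations.drop_duplicates(id_col).set_index(id_col)[type_col]
    out = obs.copy()
    # Cells without an annotation are labelled "unknown" instead of NaN
    out[type_col] = lookup.reindex(out.index).fillna("unknown").to_numpy()
    return out
```

With an AnnData object, the same call could be applied to `adata.obs`, assuming its index holds the same cell identifiers as the annotation table.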

#!/usr/bin/env python
####zhaoyunfei
####20240703
####https://stlearn.readthedocs.io/en/latest/tutorials/Integration_multiple_datasets.html

import scanpy as sc
import argparse
import pandas as pd
import scanpy.external as sce
import anndata as ad
import matplotlib.pyplot as plt
import numpy as np

# Command-line arguments
parse=argparse.ArgumentParser(description='scanpy')
parse.add_argument('--samplelist',help='CSV sample list with a header row; column 1 is the sample name, column 2 is the path passed to sc.read_visium',type=str,required = True)
parse.add_argument('--outdir',help='the analysis dir',type=str)
parse.add_argument('--resolution',help='the clustering resolution',type=float,default = 0.5)
argv = parse.parse_args()

samplelist = argv.samplelist
outdir = argv.outdir
resolution = argv.resolution

# Read the sample list; pandas treats the first row as a header
sample = pd.read_csv(samplelist,sep = ',')

# Map each sample name (column 1) to its input path (column 2)
samples = dict(zip(sample.iloc[:,0], sample.iloc[:,1]))

adatas = {}

for sample_id, filename in samples.items():
    sample_adata = sc.read_visium(filename)
    sample_adata.var_names_make_unique()
    # Re-key the image entry in uns['spatial'] by sample name so that
    # entries from different samples stay distinct after merging
    spatial = sample_adata.uns['spatial']
    library_id = list(spatial.keys())[0]
    spatial[sample_id] = spatial.pop(library_id)
    adatas[sample_id] = sample_adata
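For reference, here is a small sketch of the sample list the script expects and how it turns into the `samples` dictionary. The sample names and paths are made up, and `read_sample_list` is a hypothetical helper. The point to notice is that `pd.read_csv` consumes the first row as a header, so a list without a header line would silently lose its first sample.

```python
import io

import pandas as pd


def read_sample_list(handle):
    """Parse a two-column CSV (sample name, path) into a dict."""
    table = pd.read_csv(handle, sep=",")
    return dict(zip(table.iloc[:, 0], table.iloc[:, 1]))


# Made-up example content; the first line is the header row
example = io.StringIO("sample,path\nS1,/data/S1/outs\nS2,/data/S2/outs\n")
samples = read_sample_list(example)
```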

Integration
