API: Tools
sc.tl.pca()
This function is used the same way as sc.pp.pca(); sc.tl.pca() itself is deprecated.
https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.pca.html#scanpy.pp.pca
PCA, linear dimensionality reduction
sc.pp.pca(data=, svd_solver="arpack")
n_comps=: number of PCs to be calculated
layer=: expression matrix to use; default=None, .X will be used
svd_solver=: SVD solver to be used
Results are stored in .obsm; use .obsm['X_pca'] to extract them.
sc.tl.tsne()
https://scanpy.readthedocs.io/en/stable/generated/scanpy.tl.tsne.html#scanpy.tl.tsne
tSNE
sc.tl.tsne(adata, n_pcs=, random_state=, use_rep=)
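n_pcs=: number of top PCs to use; default=None, in which case all PCs available in the chosen representation are used
random_state=: sets the random seed so that results are reproducible across runs
use_rep=: key in .obsm to use; default=None, .obsm['X_pca'] will be used; after data integration, .obsm['X_pca_harmony'] should be assigned
Results are stored in .obsm; use .obsm['X_tsne'] to extract them.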
sc.tl.umap()
https://scanpy.readthedocs.io/en/stable/generated/scanpy.tl.umap.html
UMAP
sc.tl.umap(adata, min_dist=, spread=, random_state=)
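min_dist=: how tightly points are packed in the final embedding; smaller values give a tighter layout and favor local structure; larger values give a looser layout and favor global structure; default=0.5; values lie in [0, 1]
spread=: used together with min_dist=; controls how spread out the final points are; smaller values make the points more concentrated; default=1
random_state=: sets the random seed so that results are reproducible across runs
Identifying cell subpopulations:
sc.tl.umap(adata, min_dist=0.1, spread=1.0, random_state=149)
Overall structure analysis:
sc.tl.umap(adata, min_dist=0.8, spread=2.0, random_state=149)
Results are stored in .obsm; use .obsm['X_umap'] to extract them.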
sc.tl.leiden()
https://scanpy.readthedocs.io/en/stable/generated/scanpy.tl.leiden.html#scanpy.tl.leiden
Cluster cells using the Leiden algorithm
sc.tl.leiden(adata, resolution=, random_state=)
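resolution=: controls the clustering resolution; higher values lead to more clusters; default=1
random_state=: sets the random seed so that results are reproducible across runs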
sc.tl.leiden(adata, resolution=1, random_state=149)
sc.tl.rank_genes_groups()
Expects logarithmized data.
https://scanpy.readthedocs.io/en/stable/generated/scanpy.tl.rank_genes_groups.html#scanpy.tl.rank_genes_groups
Call the DEGs of every cluster:
sc.tl.rank_genes_groups(scanpy_object, groupby=, method=, n_genes=, use_raw=, layer=)
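groupby=: the grouping key; use 'leiden' to group by the clustering result
method=: statistical method used for the test; default='t-test'
n_genes=: number of genes to return; default=None, which returns all genes, so assign a value to limit the output
use_raw=: use .raw.X for the calculation; default=None, .raw.X will be used if it is present
layer=: key of .layers to be used for the calculation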
sc.tl.rank_genes_groups(scanpy_object, groupby='leiden', method='t-test', n_genes=20, use_raw=True, layer=None)
The computed results are stored in .uns.
Extract the data and convert it to a DataFrame:
>>> import pandas as pd
>>> kk = pd.DataFrame(scanpy_object.uns['rank_genes_groups']['names'])  # extract gene names
>>> display(kk.shape)
>>> kk.head()  # Example 1, gene names
>>> pp = pd.DataFrame(scanpy_object.uns['rank_genes_groups']['pvals'])  # extract p-values
>>> display(pp.shape)
>>> pp.head()  # Example 2, p-values
The logfoldchanges in the results are log2 fold changes.
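To make the log2 scale concrete, here is a small arithmetic sketch of a log2 fold change computed from two mean expression values that were log1p-transformed with the natural log. The function name and the pseudocount value are assumptions for illustration; scanpy's internal computation may differ in detail.

```python
import math

def log2_fold_change(mean_log1p_group, mean_log1p_ref, pseudocount=1e-9):
    """log2 ratio of two means after undoing a natural-log log1p transform."""
    # Undo log1p to get back to the linear scale; the pseudocount avoids division by zero.
    group = math.expm1(mean_log1p_group) + pseudocount
    ref = math.expm1(mean_log1p_ref) + pseudocount
    return math.log2(group / ref)

# A gene with mean 4 in the group and mean 1 in the reference (linear scale).
lfc = log2_fold_change(math.log1p(4.0), math.log1p(1.0))
print(round(lfc, 3))  # 2.0, i.e. a 4-fold increase
```

A value of 1 therefore means a 2-fold increase, and -1 a 2-fold decrease, relative to the reference.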
Differential expression analysis between two cell types:
sc.tl.rank_genes_groups(scanpy_object, groupby='column_name', groups=['celltype_name_1'], reference='celltype_name_2', method='wilcoxon', layer='log1p')
NOTE: the result is groups= vs. reference=
Extract the differential expression results:
deg_results = sc.get.rank_genes_groups_df(scanpy_object, group="celltype_name_1")
NOTE: the result is groups= vs. reference=


