scanpy.tl

API: Tools

sc.tl.pca()

Usage is identical to sc.pp.pca(); sc.tl.pca() itself is deprecated.

https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.pca.html#scanpy.pp.pca

PCA: linear dimensionality reduction

sc.pp.pca(data=, svd_solver="arpack")

    1. n_comps=: number of PCs to be calculated
    1. layer=: expression matrix to use; default=None, in which case .X is used
    1. svd_solver=: SVD solver to use

The result is stored in .obsm; use .obsm['X_pca'] to retrieve it.

sc.tl.tsne()

https://scanpy.readthedocs.io/en/stable/generated/scanpy.tl.tsne.html#scanpy.tl.tsne

tSNE

sc.tl.tsne(adata, n_pcs=, random_state=, use_rep=)

    1. n_pcs=: number of top PCs to use; default=None, all PCs in the chosen representation are used
    1. random_state=: random seed, set so that results are reproducible between runs
    1. use_rep=: key in .obsm to use; default=None, .obsm['X_pca'] is used; after data integration, use_rep='X_pca_harmony' should be specified

The result is stored in .obsm; use .obsm['X_tsne'] to retrieve it.

sc.tl.umap()

https://scanpy.readthedocs.io/en/stable/generated/scanpy.tl.umap.html

UMAP

sc.tl.umap(adata, min_dist=, spread=, random_state=)

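    1. min_dist=: how tightly points are packed in the final embedding; smaller values give a more compact layout that favors local structure, larger values give a looser layout that favors global structure; default=0.5; values lie in [0, 1]
    1. spread=: used together with min_dist=; controls how spread out the embedded points are; smaller values pull the points closer together; default=1
    1. random_state=: random seed, set so that results are reproducible between runs

To identify cell subpopulations: sc.tl.umap(adata, min_dist=0.1, spread=1.0, random_state=149)
To analyze the overall structure: sc.tl.umap(adata, min_dist=0.8, spread=2.0, random_state=149)

The result is stored in .obsm; use .obsm['X_umap'] to retrieve it.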

sc.tl.leiden()

https://scanpy.readthedocs.io/en/stable/generated/scanpy.tl.leiden.html#scanpy.tl.leiden

Cluster cells using the Leiden algorithm

sc.tl.leiden(adata, resolution=, random_state=)

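    1. resolution=: controls the clustering resolution; higher values give more clusters; default=1
    1. random_state=: random seed, set so that results are reproducible between runs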

sc.tl.leiden(adata, resolution=1, random_state=149)

sc.tl.rank_genes_groups()

Expects logarithmized data.
https://scanpy.readthedocs.io/en/stable/generated/scanpy.tl.rank_genes_groups.html#scanpy.tl.rank_genes_groups

Identify the DEGs (differentially expressed genes) of every cluster:

sc.tl.rank_genes_groups(scanpy_object, groupby=, method=, n_genes=, use_raw=, layer=)

    1. groupby=: key in .obs to group by; use 'leiden' to group by the clustering result
    1. method=: statistical test to use; default='t-test'
    1. n_genes=: number of genes to return; default=None, all genes are returned
    1. use_raw=: whether to use .raw.X for the calculation; default=None, .raw.X is used if present
    1. layer=: key in .layers to use for the calculation

sc.tl.rank_genes_groups(scanpy_object, groupby='leiden', method='t-test', n_genes=20, use_raw=True, layer=None)

The results are stored in .uns['rank_genes_groups'].

Extract the results and convert them to a DataFrame:

>>> import pandas as pd
>>> kk = pd.DataFrame(scanpy_object.uns['rank_genes_groups']['names'])     # extract gene names
>>> display(kk.shape)
>>> kk.head()     # Example 1, gene names

>>> pp = pd.DataFrame(scanpy_object.uns['rank_genes_groups']['pvals'])     # extract p-values
>>> display(pp.shape)
>>> pp.head()     # Example 2, p-values

In the results, logfoldchanges are log2 fold changes.

Figure: Example 1, gene names

Figure: Example 2, p-values

Differential expression analysis between two cell types:

sc.tl.rank_genes_groups(scanpy_object, groupby='column_name', groups=['celltype_name_1'], reference='celltype_name_2', method='wilcoxon', layer='log1p')

NOTE: the result is groups= vs. reference=

Extract the differential expression results:

deg_results = sc.get.rank_genes_groups_df(scanpy_object, group="celltype_name_1")

NOTE: the result is groups= vs. reference=
