Single-cell data analysis inevitably involves clustering cells, so let's go over the basics:

1. Basic functions:

FindNeighbors:
Constructs a Shared Nearest Neighbor (SNN) Graph for a given dataset. We first determine the k-nearest neighbors of each cell. We use this knn graph to construct the SNN graph by calculating the neighborhood overlap (Jaccard index) between every cell and its k.param nearest neighbors.
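
To make the description above concrete, here is a minimal base-R sketch of the idea: find each cell's k nearest neighbors, then score each cell against its neighbors by the Jaccard overlap of their neighbor sets. The toy embedding `emb`, the value `k = 3`, and the helper `jaccard()` are assumptions for illustration only; this is not how Seurat implements FindNeighbors internally.

```r
set.seed(1)
# Toy data: 8 "cells" in 2 dimensions, two well-separated groups of 4
# (a hypothetical stand-in for PCA embeddings)
emb <- rbind(matrix(rnorm(8, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(8, mean = 5, sd = 0.3), ncol = 2))
k <- 3

# Pairwise Euclidean distances between cells
d <- as.matrix(dist(emb))

# For each cell, the indices of its k nearest neighbors
# (the cell itself is included, at distance 0)
knn <- t(apply(d, 1, function(row) order(row)[1:k]))

# Jaccard index: size of the intersection over size of the union
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

# SNN weights: overlap between each cell and each of its k nearest neighbors
n <- nrow(emb)
snn <- matrix(0, n, n)
for (i in 1:n) {
  for (j in knn[i, ]) {
    snn[i, j] <- jaccard(knn[i, ], knn[j, ])
  }
}
round(snn, 2)
```

Cells in different groups share no neighbors here, so every weight between the two groups stays at 0, which is the structure the clustering step later exploits.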

FindClusters:
Identify clusters of cells by a shared nearest neighbor (SNN) modularity optimization based clustering algorithm. First calculate k-nearest neighbors and construct the SNN graph. Then optimize the modularity function to determine clusters.

pbmc <- FindNeighbors(pbmc, dims = 1:10)

pbmc <- FindClusters(pbmc, resolution = 0.5)

levels(pbmc)

#[1] "0" "1" "2" "3" "4" "5" "6" "7" "8"

2. Using the functions:

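The FindClusters() function
This function clusters cells on the SNN graph built by FindNeighbors(). Its resolution parameter is the key setting that controls the granularity of the downstream clustering. A value between 0.3 and 1 is usually a reasonable starting point, but it still needs to be tuned for each individual dataset. The higher the resolution, the more clusters you get; for larger datasets from complex tissues, a higher resolution can separate more cell populations.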
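
One way to see why a higher resolution yields more clusters is to score partitions of a toy graph with a resolution-scaled modularity, Q = (1/2m) Σ_ij [A_ij − γ·k_i·k_j/(2m)]·δ(c_i, c_j). The sketch below assumes this common formulation; the function `modularity_q()`, the toy graph, and the chosen γ values are hypothetical and are not taken from Seurat's source code.

```r
# Resolution-scaled modularity of a partition of an undirected graph
# adj: symmetric 0/1 adjacency matrix; membership: cluster label per node
modularity_q <- function(adj, membership, gamma = 1) {
  two_m <- sum(adj)                 # twice the number of edges
  deg <- rowSums(adj)               # node degrees
  same <- outer(membership, membership, "==")
  sum((adj - gamma * outer(deg, deg) / two_m) * same) / two_m
}

# Toy graph: two triangles (nodes 1-3 and 4-6) joined by one edge (3-4)
edges <- rbind(c(1, 2), c(1, 3), c(2, 3),
               c(4, 5), c(4, 6), c(5, 6),
               c(3, 4))
adj <- matrix(0, 6, 6)
adj[edges] <- 1
adj[edges[, 2:1]] <- 1              # make the matrix symmetric

merged     <- rep(1, 6)             # everything in one cluster
two_groups <- c(1, 1, 1, 2, 2, 2)   # one cluster per triangle

for (gamma in c(0.2, 1)) {
  cat("gamma =", gamma,
      " merged:", round(modularity_q(adj, merged, gamma), 3),
      " two_groups:", round(modularity_q(adj, two_groups, gamma), 3), "\n")
}
```

On this graph the single merged cluster scores higher at γ = 0.2, while the two-cluster partition scores higher at γ = 1, so raising the resolution shifts the optimum toward finer partitions.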

The resolution parameter accepts several values at once. The results can be inspected through pbmc@meta.data, where each resolution gets its own column.

pbmc <- FindClusters(pbmc, resolution = c(0.4, 0.5, 0.6, 0.8, 1))
head(pbmc@meta.data)
#                 orig.ident nCount_RNA nFeature_RNA percent.mt RNA_snn_res.0.5 seurat_clusters RNA_snn_res.0.4 RNA_snn_res.0.6 RNA_snn_res.0.8 RNA_snn_res.1
#AAACATACAACCAC-1     pbmc3k       2419          779  3.0177759               1               1               2               1               1             1
#AAACATTGAGCTAC-1     pbmc3k       4903         1352  3.7935958               3               2               3               3               2             2
#AAACATTGATCAGC-1     pbmc3k       3147         1129  0.8897363               1               1               2               1               1             1
#AAACCGTGCTTCCG-1     pbmc3k       2639          960  1.7430845               2               4               1               2               4             4
#AAACCGTGTATGCG-1     pbmc3k        980          521  1.2244898               6               7               6               6               7             7
#AAACGCACTGGTAC-1     pbmc3k       2163          781  1.6643551               1               1               2               1               1             1
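
With one column per resolution, a quick way to compare settings is to count the distinct cluster labels in each column and cross-tabulate two of them. The sketch below uses a small toy data frame `meta` whose values are copied from three columns of the six rows printed above; on a real object you would presumably use pbmc@meta.data instead. The names `meta`, `res_cols`, and `n_clusters` are illustrative only.

```r
# Toy stand-in for pbmc@meta.data (values copied from the head() output above)
meta <- data.frame(
  RNA_snn_res.0.4 = factor(c(2, 3, 2, 1, 6, 2)),
  RNA_snn_res.0.6 = factor(c(1, 3, 1, 2, 6, 1)),
  RNA_snn_res.1   = factor(c(1, 2, 1, 4, 7, 1)),
  row.names = paste0("cell", 1:6)
)

# Pick out the per-resolution clustering columns by their name prefix
res_cols <- grep("^RNA_snn_res\\.", colnames(meta), value = TRUE)

# Number of distinct cluster labels observed in each column
n_clusters <- sapply(meta[res_cols], function(x) length(unique(x)))
print(n_clusters)

# How cells move between clusters when the resolution changes
print(table(res_0.4 = meta$RNA_snn_res.0.4, res_1 = meta$RNA_snn_res.1))
```

Because the toy table has only six cells, the counts here reflect those rows only, not the full number of clusters in the dataset.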

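Afterwards, when visualizing the non-linear dimensional reduction, you can use the Idents() function to choose which resolution's clustering to display.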

# Assign identity of clusters

Idents(object=pbmc) <- "RNA_snn_res.1"

levels(pbmc)

#[1] "0" "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"

Idents(object=pbmc) <- "RNA_snn_res.0.6"

levels(pbmc)

#[1] "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"

Idents(object=pbmc) <- "RNA_snn_res.0.4"

levels(pbmc)

#[1] "0" "1" "2" "3" "4" "5" "6" "7" "8"

pbmc <- RunUMAP(pbmc, dims =1:10)

DimPlot(pbmc, reduction ="umap")

Figure: UMAP plot colored by the RNA_snn_res.0.4 clusters

The RenameIdents() function: renaming cluster annotations

Whether you are annotating clusters based on known marker genes or simply changing the cluster names, you can do it with the RenameIdents() function.

new.cluster.id <- c("A","B","C","D","E","F","G","H","I")

names(new.cluster.id) <- levels(pbmc)

pbmc <- RenameIdents(pbmc, new.cluster.id)
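
The named vector in the snippet above acts as a lookup table from old cluster labels to new ones. The base-R sketch below mimics that mapping on a plain factor so the mechanics are visible without a Seurat object; `old_idents` and `renamed` are hypothetical names, and this illustrates the idea rather than RenameIdents()'s actual implementation. It also shows why the number of new names has to match the number of levels, which changes with the active resolution (9, 10, or 11 levels in the outputs above).

```r
# Toy identities for six cells, with the nine levels of the res 0.4 clustering
old_idents <- factor(c("0", "2", "1", "0", "8", "3"),
                     levels = as.character(0:8))

new.cluster.id <- c("A", "B", "C", "D", "E", "F", "G", "H", "I")

# Guard: one new name per existing level, otherwise the mapping is incomplete
stopifnot(length(new.cluster.id) == nlevels(old_idents))
names(new.cluster.id) <- levels(old_idents)

# Look up each cell's old label in the named vector
renamed <- factor(unname(new.cluster.id[as.character(old_idents)]),
                  levels = unname(new.cluster.id))
print(renamed)
```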

Figure: clusters after renaming with RenameIdents()

Let's all learn and discuss together!


References:
scRNA-seq Clustering
https://cloud.tencent.com/developer/article/1669057
scRNA-seq Clustering(二)
https://cloud.tencent.com/developer/article/1678071
Seurat - Guided Clustering Tutorial
https://satijalab.org/seurat/v3.1/pbmc3k_tutorial.html

Last edited: 2020.12.24 16:54:52

Reproduction is prohibited. To request permission to repost, please contact the author by direct message or comment.