大暑:一年中最热的时候。
在Hafemeister和Satija,2019年,我们基于正则化负二项式回归引入了一种改进的scRNA-seq标准化方法。该方法名为“ sctransform”,可避免标准规范化工作流程的某些陷阱,包括添加伪计数和对数转换。您可以在手稿或我们的SCTransform插图中阅读有关sctransform的更多信息。
下面,我们演示如何针对已使用sctransform工作流程标准化的数据集修改Seurat集成工作流程。这些命令在很大程度上相似,但有一些主要区别:
- 通过SCTransform()而不是NormalizeData()在集成之前分别对数据集进行规范化
- 正如我们在SCTransform插图中进一步讨论的那样,我们通常使用3,000或更多功能在sctransform的下游进行分析。
- PrepSCTIntegration()在确定锚点之前运行该功能
- 运行FindIntegrationAnchors()
和时IntegrateData(),将normalization.method参数设置为值SCT`。 - 在运行基于sctransform的工作流(包括集成)时,请勿运行该ScaleData()`功能
LoadData("ifnb")
ifnb.list <- SplitObject(ifnb, split.by = "stim")
ifnb.list <- lapply(X = ifnb.list, FUN = SCTransform)
features <- SelectIntegrationFeatures(object.list = ifnb.list, nfeatures = 3000)
ifnb.list <- PrepSCTIntegration(object.list = ifnb.list, anchor.features = features)
immune.anchors <- FindIntegrationAnchors(object.list = ifnb.list, normalization.method = "SCT",
anchor.features = features)
immune.combined.sct <- IntegrateData(anchorset = immune.anchors, normalization.method = "SCT")
immune.combined.sct <- RunPCA(immune.combined.sct, verbose = FALSE)
immune.combined.sct <- RunUMAP(immune.combined.sct, reduction = "pca", dims = 1:30)
p1 <- DimPlot(immune.combined.sct, reduction = "umap", group.by = "stim")
p2 <- DimPlot(immune.combined.sct, reduction = "umap", group.by = "seurat_annotations", label = TRUE,
repel = TRUE)
p1 + p2

image
现在已经集成了数据集,您可以按照此小插图中的先前步骤确定单元格类型和特定于单元格类型的响应。
library(Seurat)
> library(Seurat)
> library(SeuratData)
> library(patchwork)
> # install dataset
> #InstallData("ifnb")
> LoadData("ifnb")
An object of class Seurat
14053 features across 13999 samples within 1 assay
Active assay: RNA (14053 features, 0 variable features)
> ifnb.list <- SplitObject(ifnb, split.by = "stim")
> ifnb.list <- lapply(X = ifnb.list, FUN = SCTransform)
Calculating cell attributes from input UMI matrix: log_umi
Variance stabilizing transformation of count matrix of size 12747 by 6548
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 6548 cells
|===============================================================================| 100%
Found 82 outliers - those will be ignored in fitting/regularization step
Second step: Get residuals using fitted parameters for 12747 genes
|===============================================================================| 100%
Computing corrected count matrix for 12747 genes
|===============================================================================| 100%
Calculating gene attributes
Wall clock passed: Time difference of 2.109865 mins
Determine variable features
Set 3000 variable features
Place corrected count matrix in counts slot
Centering data matrix
|===============================================================================| 100%
Set default assay to SCT
Calculating cell attributes from input UMI matrix: log_umi
Variance stabilizing transformation of count matrix of size 12658 by 7451
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 7451 cells
|===============================================================================| 100%
Found 78 outliers - those will be ignored in fitting/regularization step
Second step: Get residuals using fitted parameters for 12658 genes
|===============================================================================| 100%
Computing corrected count matrix for 12658 genes
|===============================================================================| 100%
Calculating gene attributes
Wall clock passed: Time difference of 2.335794 mins
Determine variable features
Set 3000 variable features
Place corrected count matrix in counts slot
Centering data matrix
|===============================================================================| 100%
Set default assay to SCT
> ifnb.list <- PrepSCTIntegration(object.list = ifnb.list, anchor.features = features)
| | 0 % ~calculating Error in scale.data[anchor.features, ] : 下标出界
> #慢
> features <- SelectIntegrationFeatures(object.list = ifnb.list, nfeatures = 2000)
> ifnb.list <- PrepSCTIntegration(object.list = ifnb.list, anchor.features = features)
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=04s
> immune.anchors <- FindIntegrationAnchors(object.list = ifnb.list, normalization.method = "SCT",
+ anchor.features = features)
Finding all pairwise anchors
| | 0 % ~calculating Running CCA
Merging objects
Finding neighborhoods
Finding anchors
Found 14903 anchors
Filtering anchors
Retained 11743 anchors
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=02m 56s
> immune.combined.sct <- IntegrateData(anchorset = immune.anchors, normalization.method = "SCT")
Merging dataset 1 into 2
Extracting anchors for merged samples
Finding integration vectors
Finding integration vector weights
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Integrating data
Warning: Adding a command log without an assay associated with it
> immune.combined.sct <- RunPCA(immune.combined.sct, verbose = FALSE)
> immune.combined.sct <- RunUMAP(immune.combined.sct, reduction = "pca", dims = 1:30)
11:02:19 UMAP embedding parameters a = 0.9922 b = 1.112
11:02:19 Read 13999 rows and found 30 numeric columns
11:02:19 Using Annoy for neighbor search, n_neighbors = 30
11:02:19 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
11:02:22 Writing NN index file to temp file C:\Users\zzu\AppData\Local\Temp\RtmpO0TeZF\file346443924fc5
11:02:22 Searching Annoy index using 1 thread, search_k = 3000
11:02:27 Annoy recall = 100%
11:02:27 Commencing smooth kNN distance calibration using 1 thread
11:02:29 Initializing from normalized Laplacian + noise
11:02:29 Commencing optimization for 200 epochs, with 603096 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
11:02:48 Optimization finished
> p1 <- DimPlot(immune.combined.sct, reduction = "umap", group.by = "stim")
> p2 <- DimPlot(immune.combined.sct, reduction = "umap", group.by = "seurat_annotations", label = TRUE,
+ repel = TRUE)
> p1 + p2
Error in grid.Call(C_convert, x, as.integer(whatfrom), as.integer(whatto), :
Viewport has zero dimension(s)