Several facts about SCENIC

1. GENIE3 accepts either UMI counts or library-size normalized counts as its input expression matrix; the two give similar results.

SCENIC: single-cell regulatory network inference and clustering

To evaluate to what extent the normalization of the input matrix affects the output of SCENIC, we also ran SCENIC on the Zeisel et al. [9] data set after library-size normalization (using the standard pipeline from scran [27], which performs within-cluster size-factor normalization). The results are highly comparable, both in regards to resulting clusters or cell types (ARI between the cell types obtained from raw UMI counts or normalized counts: 0.90, ARI from normalized counts compared to the author's cell types: 0.87) and to the TFs identifying the groups (26 out of the 30 regulons highlighted in Fig. 1b). Furthermore, during the course of this project we have applied GENIE3 to multiple data sets, some of them having UMI counts (e.g., mouse brain and oligodendrocytes) and others TPM (e.g., human brain and melanoma), and both units provided reliable results.
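
The ARI values quoted above compare two labelings of the same cells. As a rough illustration of what that number measures, here is a minimal pure-Python sketch of the adjusted Rand index. The cluster labels are toy data invented for this example, not values from the paper, and the function name is hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same cells")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        # Fewer than two cells: nothing to disagree about.
        return 1.0
    # Contingency counts: cells in cluster i of A that are also in cluster j of B.
    pair_counts = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / total_pairs
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case, e.g. every cell in a single cluster in both labelings.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy labels (hypothetical): cell types from raw counts vs. from normalized counts.
raw_clusters = ["neuron", "neuron", "astro", "astro", "oligo", "oligo"]
norm_clusters = ["c1", "c1", "c2", "c2", "c3", "c2"]
ari_same = adjusted_rand_index(raw_clusters, raw_clusters)
ari_toy = adjusted_rand_index(raw_clusters, norm_clusters)
print(ari_same, ari_toy)
```

Identical partitions score 1 regardless of how the clusters are named, and random agreement scores around 0, so values such as 0.90 and 0.87 indicate that the two clusterings are close.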

2. The detailed SCENIC workflow:

Running SCENIC (htmlpreview.github.io)
The relevant part:

## If launched in a new session, you will need to reload...
# setwd("...")
# library(SCENIC)
# library(SCopeLoomR)  # provides open_loom(), get_dgem(), close_loom()
# scenicOptions <- readRDS("int/scenicOptions.Rds")
# loomPath <- "..."
# loom <- open_loom(loomPath)
# exprMat <- get_dgem(loom)
# close_loom(loom)
# genesKept <- loadInt(scenicOptions, "genesKept")
# exprMat_filtered <- exprMat[genesKept,]

# Optional: add log (if it is not logged/normalized already)
exprMat_filtered <- log2(exprMat_filtered+1)

# Run GENIE3
runGenie3(exprMat_filtered, scenicOptions)

So the tutorial appears to run GENIE3 on normalized (log2-transformed) counts, although the log step is marked as optional.
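
To make the difference between raw UMI counts, library-size normalized counts and the log2(x+1) transform concrete, here is a minimal Python sketch (Python is used because notes 4 and 5 concern pySCENIC). It is an illustration only: the simple per-cell scaling and the scale factor of 10,000 are assumptions of this example, and it is not the within-cluster size-factor normalization from scran described in the paper. The function names and the toy matrix are hypothetical.

```python
from math import log2


def library_size_normalize(counts, scale=10_000):
    """Scale each cell (column) so that its counts sum to `scale`.

    `counts` is a genes x cells matrix given as a list of rows.
    The default scale factor is an assumption of this sketch.
    """
    n_cells = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_cells)]
    return [
        [row[j] * scale / totals[j] if totals[j] else 0.0 for j in range(n_cells)]
        for row in counts
    ]


def log2_plus_one(matrix):
    """Same transform as log2(exprMat_filtered + 1) in the R snippet."""
    return [[log2(value + 1) for value in row] for row in matrix]


# Toy UMI counts (hypothetical): 3 genes x 2 cells; cell 2 was sequenced twice as deep.
umi_counts = [
    [10, 20],
    [0, 0],
    [30, 60],
]
normalized = library_size_normalize(umi_counts)
logged = log2_plus_one(normalized)
```

After scaling, the two cells have identical profiles even though their raw counts differ by a factor of two, which is the depth effect that library-size normalization removes.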

3. SCENIC does not detect repressive regulons

SCENIC: single-cell regulatory network inference and clustering

To build the final regulons, we merge the predicted target genes of each TF module that show enrichment of any motif of the given TF. To detect repression, it is theoretically possible to follow the same approach with the negative-correlated TF modules. However, in the data sets we analyzed, these modules were less numerous and showed very low motif enrichment. For this reason, we finally decided to exclude the detection of direct repression from the workflow and continue only with the positive-correlated targets. The databases used for the analyses presented in this paper are the “18k motif collection” from iRegulon (gene-based motif rankings) for human and mouse. For each species, we used two gene-motif rankings (10 kb around the TSS or 500 bp upstream the TSS), which determine the search space around the transcription start site.
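
The split into positive- and negative-correlated targets described in the quote can be sketched as follows. This is a minimal illustration, not SCENIC's implementation: the use of Pearson correlation, the cutoff of 0, the function names and the toy expression values are all assumptions of this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant gene carries no correlation signal.
        return 0.0
    return cov / sqrt(var_x * var_y)


def split_targets_by_sign(tf_expr, target_exprs, cutoff=0.0):
    """Split candidate targets into positive- and negative-correlated lists.

    `cutoff` is an assumed threshold; targets with |rho| <= cutoff are dropped.
    """
    positive, negative = [], []
    for gene, expr in target_exprs.items():
        rho = pearson(tf_expr, expr)
        if rho > cutoff:
            positive.append(gene)
        elif rho < -cutoff:
            negative.append(gene)
    return positive, negative


# Toy data (hypothetical): one TF and three candidate targets across four cells.
tf_expression = [1.0, 2.0, 3.0, 4.0]
candidate_targets = {
    "GeneA": [2.0, 4.0, 6.0, 8.0],  # rises with the TF
    "GeneB": [8.0, 6.0, 4.0, 2.0],  # falls as the TF rises
    "GeneC": [5.0, 5.0, 5.0, 5.0],  # constant
}
activated, repressed = split_targets_by_sign(tf_expression, candidate_targets)
```

In the workflow described above, only the first list (here `activated`) would continue to motif enrichment; the second list corresponds to the candidate repression modules that the authors chose to drop.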

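4. pySCENIC output: the reg.csv file contains the regulons and their target genes. Each row of reg.csv represents one motif and its corresponding target genes, so a single regulon may span several motifs. The SCENIC workflow takes the union of the target genes across all motifs of a regulon and then scores that gene set with AUCell.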

How to get the list of target genes for one regulon from the output regulon.csv file of ctx · Issue #301 · aertslab/pySCENIC (github.com)
SCENIC: single-cell regulatory network inference and clustering
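
The union step in note 4 can be sketched in a few lines of Python. The rows below are toy `(TF, motif, target genes)` tuples invented for illustration; they do not reproduce the actual column layout of reg.csv, and the function name is hypothetical.

```python
from collections import defaultdict


def merge_motif_rows(rows):
    """Union the target genes of all motif rows that belong to the same TF."""
    regulons = defaultdict(set)
    for tf, _motif, targets in rows:
        regulons[tf].update(targets)
    # Sort each gene set so the output is stable and easy to read.
    return {tf: sorted(genes) for tf, genes in regulons.items()}


# Toy rows (hypothetical): one row per enriched motif.
motif_rows = [
    ("Sox10", "motif_1", ["Plp1", "Mbp"]),
    ("Sox10", "motif_2", ["Mbp", "Mog"]),
    ("Dlx1", "motif_3", ["Gad1"]),
]
regulons = merge_motif_rows(motif_rows)
```

Each TF ends up with a single deduplicated gene list, which is the kind of gene set that would then be passed to AUCell for scoring.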

5. pySCENIC infers the regulatory network with GRNBoost2, an upgraded replacement for the GENIE3 step used in SCENIC; the paper reports that it performs better than GENIE3.

GRNBoost2 and Arboreto: efficient and scalable inference of gene regulatory networks | Bioinformatics | Oxford Academic (oup.com)
