1. Paper
Unsupervised removal of systematic background noise from droplet-based single-cell experiments using CellBender
2. GitHub
3. Official website
CellBender removes ambient (contaminating) RNA.
Usage:
https://cellbender.readthedocs.io/en/latest/usage/index.html
cellbender remove-background \
    --input \
    --output output.h5 \
    --expected-cells \
    --total-droplets-included \
    --cuda
--input: the input .h5 file; usually the raw_feature_bc_matrix.h5 produced by Cell Ranger
--output: the name of the output file
--expected-cells: the expected number of captured cells; in most cases use the value from the Cell Ranger report (the value at position ① in the example)
--total-droplets-included: droplets ranked beyond this number are treated as surely empty (the value at position ② in the example)
--cuda: with this flag, the GPU is used for the computation
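The options above can also be assembled from Python, which is convenient when the same command has to be run for many samples. The following is only a minimal sketch: the helper name build_remove_background_command is hypothetical, and the values 5000 and 20000 are placeholders, not recommendations. Take the real numbers from your own Cell Ranger report.

```python
import shlex


def build_remove_background_command(input_h5, output_h5, expected_cells,
                                    total_droplets_included, use_cuda=True):
    """Assemble the argument list for `cellbender remove-background`.

    Hypothetical helper for illustration; it only builds the command
    and does not run CellBender.
    """
    cmd = [
        "cellbender", "remove-background",
        "--input", str(input_h5),
        "--output", str(output_h5),
        "--expected-cells", str(expected_cells),
        "--total-droplets-included", str(total_droplets_included),
    ]
    if use_cuda:
        # Only request the GPU when one is available.
        cmd.append("--cuda")
    return cmd


# Placeholder values (assumptions): replace them with the numbers
# read from your own Cell Ranger report.
cmd = build_remove_background_command(
    "raw_feature_bc_matrix.h5", "output.h5", 5000, 20000
)
print(shlex.join(cmd))
```

To actually launch the run, the list can be passed to subprocess.run(cmd, check=True) on a machine where CellBender is installed.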
Interpreting the results:
https://cellbender.readthedocs.io/en/latest/tutorial/index.html
https://cellbender.readthedocs.io/en/latest/reference/index.html#loading-outputs
output.h5: the result after ambient RNA removal, with all barcodes retained
output_filtered.h5: the result after ambient RNA removal. The word “filtered” means that this file contains only the droplets which were determined to have a > 50% posterior probability of containing cells.
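The relationship between the two files can be pictured as a simple threshold on the per-droplet posterior probability. The sketch below is a toy illustration only: the barcodes and probabilities are made up, and the function filtered_barcodes is a hypothetical name, not part of CellBender.

```python
# Toy posterior cell probabilities per barcode (made-up values).
cell_probability = {
    "AAAC": 0.99,
    "AAAG": 0.51,
    "AAAT": 0.50,
    "AACA": 0.02,
}


def filtered_barcodes(probabilities, threshold=0.5):
    """Return barcodes whose posterior cell probability exceeds the threshold."""
    return [bc for bc, p in probabilities.items() if p > threshold]


# Every barcode would appear in output.h5; only these would be
# expected in output_filtered.h5.
kept = filtered_barcodes(cell_probability)
print(kept)  # ['AAAC', 'AAAG']
```

Note that the comparison is strict, so a droplet at exactly 0.5 is not kept in this sketch.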
anndata_from_h5()
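# import the loader explicitly; a bare "import cellbender" may not expose the downstream submodule
from cellbender.remove_background.downstream import anndata_from_h5
scanpy_object_output = anndata_from_h5('/path/output.h5')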
The output.h5 file is loaded:
- Only the top --total-droplets-included droplets, ranked by UMI count, are kept.
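- .obs contains a cell_probability column. By default, only droplets with cell_probability > 0.5 are retained as the high-quality cells for the final analysis; this result is saved in output_filtered.h5.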
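- The dimensionality-reduced data (the latent embedding) are stored in .obsm['gene_expression_encoding'] and can be used for the subsequent KNN and tSNE computation.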
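The KNN and tSNE step mentioned in the last point could look roughly like this with scanpy. This is a sketch under assumptions: the function name run_knn_and_tsne is hypothetical, and the embedding is assumed to be available under the key gene_expression_encoding in .obsm, which is where scanpy's use_rep argument looks for representations. Check adata.obsm.keys() on your own object first.

```python
def run_knn_and_tsne(adata, rep_key="gene_expression_encoding"):
    """Compute a KNN graph and a tSNE layout from a stored embedding.

    `adata` is an AnnData object, for example one returned by
    anndata_from_h5(). `rep_key` is assumed to be a key in `adata.obsm`.
    """
    # Imported lazily so the sketch can be read and loaded without scanpy.
    import scanpy as sc

    if rep_key not in adata.obsm:
        raise KeyError(f"{rep_key!r} not found in adata.obsm")

    # Build the neighbor graph on the embedding instead of on PCA.
    sc.pp.neighbors(adata, use_rep=rep_key)
    # Compute tSNE coordinates from the same representation.
    sc.tl.tsne(adata, use_rep=rep_key)
    return adata
```

After the call, the neighbor graph is stored in adata.obsp and the tSNE coordinates in adata.obsm['X_tsne'], so sc.pl.tsne(adata) can be used for plotting.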