ACM TechNews Digest (3): Manifold Learning Software for Cell Gene-Expression Data

Keywords: high-dimensional data processing, manifold learning



Figure: cell classification plot produced with a manifold learning algorithm (t-SNE)

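Researchers at Helmholtz Zentrum München (the German Research Center for Environmental Health) have developed Scanpy, a machine learning software package for handling very large gene-expression datasets. The name stands for Single-Cell Analysis in Python. Scanpy is also one of the candidate analysis tools for the Human Cell Atlas project.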

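Fabian Theis, a professor at the University of Munich, stresses that for this project, and for the growing number of projects that combine databases, scalable analysis software is essential. In his view it is therefore no surprise that Scanpy is a candidate for analyzing the Human Cell Atlas. According to Theis, the release of Scanpy marks the debut of comprehensive analysis software for large gene-expression datasets that brings together a broad range of machine learning and statistical methods. On the architecture side, biostatistics programs are traditionally written in R, whereas Scanpy is built on Python, the dominant language in the machine learning community, and relies on graph-based algorithms. Unlike the usual approach, which treats cells as points in gene-expression space, Scanpy works in a graph-like coordinate system: it characterizes each cell by its nearest neighbors rather than directly by its expression values for thousands of genes. To identify cell types, it uses the same kind of community-detection algorithms that Facebook uses to identify social groups.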
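To make the neighbor-based view concrete, here is a minimal pure-Python sketch. It is not Scanpy's code or API. The toy "cells", the helper names (`knn_graph`, `mutual_edges`, `connected_components`) and the choice of k are all made up for illustration. Connected components on a mutual k-nearest-neighbor graph serve as a deliberately simplified stand-in for real community detection on social-network-style graphs.

```python
import math


def knn_graph(points, k):
    """Map each point index to the set of indices of its k nearest neighbours."""
    neighbours = {}
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours[i] = {j for _, j in dists[:k]}
    return neighbours


def mutual_edges(neighbours):
    """Keep an undirected edge (i, j) only if i and j list each other as neighbours."""
    return {
        (i, j)
        for i, nbrs in neighbours.items()
        for j in nbrs
        if i < j and i in neighbours[j]
    }


def connected_components(n, edges):
    """Label the n nodes by connected component (union-find).

    Simplified stand-in for community detection, used here only for illustration.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)

    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical toy data: six "cells", each with expression values for two genes.
cells = [
    (1.0, 0.1), (1.2, 0.0), (0.9, 0.3),   # high gene A, low gene B
    (0.1, 2.0), (0.0, 2.2), (0.3, 1.9),   # low gene A, high gene B
]

graph = knn_graph(cells, k=2)
labels = connected_components(len(cells), mutual_edges(graph))
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Once the graph is built, the grouping step never looks at the expression values again. Each cell is described only by which cells it is connected to, which is the shift in perspective the summary describes.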

"For this project, and in a growing number of other projects in which databases are combined, it is important to have scalable software," says University of Munich professor Fabian Theis. He notes it is therefore no surprise that Scanpy is a candidate for helping to analyze the Human Cell Atlas. Theis says the publication of Scanpy represents the first time software has been developed to enable comprehensive analysis of large gene-expression datasets with a broad range of machine learning and statistical methods. Scanpy is based on the Python language, the dominant language in the machine-learning community. In addition, Theis says Scanpy relies on graph-based algorithms, differentiating the system from other biostatistics programs, which are traditionally written in the R programming language. Unlike the usual approach of regarding cells as points in a coordinate system within gene-expression space, the algorithms use a graph-like coordinate system. Instead of characterizing a single cell by the expression value for thousands of genes, the system simply characterizes cells by identifying their closest neighbors - very much like the connections in social networks. In fact, to identify cell types, Scanpy uses the same algorithms as Facebook does for identifying communities.
