gene ontology
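About two months ago I followed our class monitor's (班长) RNA-seq tutorial and worked through the whole pipeline, ending up with a long list of differentially expressed genes. But what do these genes (and their products) actually do? What effect do they have on the cell and on the organism? That brings us to the problem of gene annotation. Before GO, biologists had no unified terminology (terms) for describing the function of the same gene or gene product, whether within one species or across species. You can imagine how hard annotation work was back then!
That is when some people stepped up. In 1998, researchers working on the genomes of three model organisms (fruit fly, mouse, and yeast) jointly founded a group called the Gene Ontology Consortium. The original aim of the Gene Ontology was to provide a representative, standardized platform of terms and definitions describing the properties of genes and gene products, so that bioinformatics researchers could summarize, process, interpret, and share data on genes and gene products in a uniform way.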
How is the GO designed?
The GO project has developed three structured, independent sub-ontologies that describe gene products in a species-independent manner. The sub-ontologies are as follows:
- Cellular component (CC). Where does the product exhibit its effect? This ontology relates the gene product to a component of a cell, that is, a part of a larger object. Examples: cell, nucleus, Golgi membrane, SAGA complex.
- Molecular function (MF). How does it work? Which biochemical mechanism characterizes the product? These terms describe activities that occur at the molecular level. Examples: lactase activity, actin binding.
- Biological process (BP). What is the purpose of the gene product? The general rule for distinguishing a biological process from a molecular function is that a process must have more than one distinct step. Examples: cholesterol efflux, transport, mitotic prophase.
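To make the three sub-ontologies a bit more concrete, here is a minimal Python sketch that tags a few example terms with their sub-ontology and groups them. The `GOTerm` class and `group_by_namespace` function are names I made up for illustration, and the GO IDs are given from memory, so check them against the current GO release before relying on them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GOTerm:
    """Hypothetical minimal record for one GO term."""
    go_id: str
    name: str
    namespace: str  # "CC", "MF", or "BP"


# Example terms taken from the list above (IDs should be verified).
TERMS = [
    GOTerm("GO:0005634", "nucleus", "CC"),
    GOTerm("GO:0003779", "actin binding", "MF"),
    GOTerm("GO:0006810", "transport", "BP"),
]


def group_by_namespace(terms):
    """Group term names by the sub-ontology they belong to."""
    grouped = defaultdict(list)
    for term in terms:
        grouped[term.namespace].append(term.name)
    return dict(grouped)


grouped = group_by_namespace(TERMS)
for namespace, names in grouped.items():
    print(namespace, names)
```

Every term belongs to exactly one of the three sub-ontologies, which is why a single `namespace` field is enough in this sketch.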
gene ontology structure
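GO is organized as a DAG (directed acyclic graph): terms are the nodes, relations such as is_a and part_of are the edges, and, unlike in a tree, a term may have more than one parent.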
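A small sketch of what "DAG" means in practice: a term can have several parents, and walking the parent links upward collects all of its ancestors. The parent map below is a simplified toy I wrote for illustration, not the actual set of GO relations, and `ancestors` is a hypothetical helper name.

```python
# Toy child -> parents map; simplified, NOT the real GO edges.
PARENTS = {
    "Golgi membrane": ["organelle membrane", "Golgi apparatus"],
    "organelle membrane": ["membrane"],
    "Golgi apparatus": ["organelle"],
    "membrane": ["cellular component"],
    "organelle": ["cellular component"],
    "cellular component": [],
}


def ancestors(term, parents):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


print(sorted(ancestors("Golgi membrane", PARENTS)))
```

Because "Golgi membrane" has two parents here, there are two different paths up to the root; the `seen` set keeps shared ancestors from being visited twice.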
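Here is my own understanding of GO:
- Many genes are highly conserved across species
- The same gene goes by different names in different species
- Together, these two points make whole-genome annotation difficult, because homologous genes cannot be matched up
So a hierarchical, common controlled vocabulary was built, and that is GO.
With this vocabulary, everyone's annotations refer to predefined, official terms, which is what makes communication and exchange possible.
And because GO is hierarchical, we can go beyond individual genes or gene products and identify categories above the level of the single gene.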
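The last point can be sketched in a few lines of Python: if each gene's annotation is propagated up to all ancestor terms, genes annotated with different specific terms end up grouped under a shared, broader category. The gene names (`geneA`, `geneB`, `geneC`), the helper functions, and the parent links are all hypothetical and simplified; this is not the real GO graph or any library's API.

```python
from collections import defaultdict

# Toy child -> parents map; simplified, NOT the real GO edges.
PARENTS = {
    "cholesterol efflux": ["lipid transport"],
    "lipid transport": ["transport"],
    "ion transport": ["transport"],
    "transport": ["biological process"],
    "biological process": [],
}

# Hypothetical genes, each with its most specific annotation.
DIRECT_ANNOTATIONS = {
    "geneA": ["cholesterol efflux"],
    "geneB": ["ion transport"],
    "geneC": ["transport"],
}


def with_ancestors(term, parents):
    """Return the term itself plus every term reachable via parent links."""
    found = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def genes_per_term(annotations, parents):
    """Propagate each gene's annotations upward and collect genes per term."""
    rolled_up = defaultdict(set)
    for gene, terms in annotations.items():
        for term in terms:
            for t in with_ancestors(term, parents):
                rolled_up[t].add(gene)
    return dict(rolled_up)


summary = genes_per_term(DIRECT_ANNOTATIONS, PARENTS)
for term in sorted(summary):
    print(term, sorted(summary[term]))
```

In this toy data no two genes share a specific term, yet all three end up under "transport", which is the kind of category above the single gene that the hierarchy makes visible.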