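Problem
Sometimes we want to know which genes are associated with a particular GO annotation category, so we need a way to extract every gene annotated to that GO term.
Solution
After some searching, I found that the following code does the job: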
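library(tidyverse)     # provides the %>% pipe
library(org.Hs.eg.db)  # human annotation package
# GOID is a GO term ID given as a string, e.g. 'GO:0006260'
GOgeneID <- get(GOID, org.Hs.egGO2ALLEGS) %>% mget(org.Hs.egSYMBOL) %>% unlist()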
Below, the biological process DNA replication (GO:0006260) serves as a worked example, using the human GO annotations.
library(tidyverse)
library(org.Hs.eg.db)
# GO ID --> Entrez gene ID
DNA_geneID <- get('GO:0006260', org.Hs.egGO2ALLEGS)
> head(DNA_geneID)
  TAS   IEA   TAS   IMP   TAS   ISS
 "94" "466" "472" "545" "545" "546"
> length(DNA_geneID)
[1] 421
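org.Hs.egGO2ALLEGS maps each GO ID to the Entrez IDs of the genes annotated to that term or to any of its child terms. The names of the returned vector give the evidence code behind each annotation, so the same gene can appear more than once (545 is listed above with both IMP and TAS). The evidence codes include: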
IMP: inferred from mutant phenotype
IGI: inferred from genetic interaction
IPI: inferred from physical interaction
ISS: inferred from sequence similarity
IDA: inferred from direct assay
IEP: inferred from expression pattern
IEA: inferred from electronic annotation
TAS: traceable author statement
NAS: non-traceable author statement
ND: no biological data available
IC: inferred by curator
The full list of evidence codes is documented at:
http://geneontology.org/docs/guide-go-evidence-codes/
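Because the evidence codes are stored as the names of the vector, they can be counted or used as a filter with base R alone. The sketch below is only an illustration: `ids` is a toy stand-in built from the six entries printed by head() above, and dropping IEA annotations is an assumed filtering choice, not something the original workflow requires.

```r
# Toy stand-in for DNA_geneID: the six entries shown by head() above
ids <- c(TAS = "94", IEA = "466", TAS = "472",
         IMP = "545", TAS = "545", ISS = "546")

# Count how many annotations carry each evidence code
evidence_counts <- table(names(ids))

# Drop electronically inferred annotations (IEA),
# then collapse genes that are listed more than once
curated_ids <- unique(unname(ids[names(ids) != "IEA"]))

print(evidence_counts)
print(curated_ids)
```

The same two lines should work unchanged on the full DNA_geneID vector, since it has the same structure (Entrez IDs as values, evidence codes as names).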
Going one step further, we can convert the Entrez IDs to gene symbols:
DNA_geneSYMBOL <- mget(DNA_geneID, org.Hs.egSYMBOL) %>% unlist()
> head(DNA_geneSYMBOL)
      94      466      472      545      545      546
"ACVRL1"   "ATF1"    "ATM"    "ATR"    "ATR"   "ATRX"
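As an alternative to chaining get() and mget(), the same lookup can probably be done in one call with AnnotationDbi's select() interface, which returns a data frame rather than a named vector. This is a sketch under assumptions: `genes_for_go` is a hypothetical helper name introduced here, and it relies on the "GOALL" keytype and the "EVIDENCEALL" column that the org.Hs.eg.db package exposes.

```r
library(org.Hs.eg.db)  # also attaches AnnotationDbi

# genes_for_go is a hypothetical helper, not part of any package.
# It returns one row per (gene, evidence code) pair for the GO term,
# including genes annotated to the term's child terms (keytype "GOALL").
genes_for_go <- function(go_id) {
  AnnotationDbi::select(
    org.Hs.eg.db,
    keys    = go_id,
    keytype = "GOALL",
    columns = c("ENTREZID", "SYMBOL", "EVIDENCEALL")
  )
}

dna_genes <- genes_for_go("GO:0006260")
head(dna_genes)

# Unique gene symbols, with repeated annotations collapsed
dna_symbols <- unique(dna_genes$SYMBOL)
```

Calling select() with the AnnotationDbi:: prefix avoids any clash with dplyr's select() when tidyverse is also loaded.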
That's all.
References
https://davetang.org/muse/2011/05/20/extract-gene-names-according-to-go-terms/
https://www.ebi.ac.uk/QuickGO/term/GO:0006260
http://geneontology.org/docs/guide-go-evidence-codes/