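I've been reviewing HGVS nomenclature lately, so treat this as my personal translation notes.
HGVS stands for Human Genome Variation Society.
This week I translated the first section, General. The original is at http://varnomen.hgvs.org/recommendations/general/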
【General】
A letter prefix indicates the type of reference sequence used:
“g.” for a genomic reference sequence
“c.” for a coding DNA reference sequence
“n.” for a non-coding DNA reference sequence
“r.” for an RNA reference sequence (transcript)
“p.” for a protein reference sequence
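The prefix list above maps naturally onto a lookup table. Here is a minimal sketch, assuming the description is given without an accession in front of it. The names `REFERENCE_TYPES` and `reference_type` are my own, not part of HGVS or of any library, and only the five prefixes listed above are covered.

```python
# Prefix-to-reference-type table, copied from the list above.
REFERENCE_TYPES = {
    "g": "genomic",
    "c": "coding DNA",
    "n": "non-coding DNA",
    "r": "RNA (transcript)",
    "p": "protein",
}


def reference_type(description: str) -> str:
    """Return the reference sequence type for a description such as 'c.76dupA'."""
    # Everything before the first dot is treated as the prefix.
    prefix, dot, _rest = description.partition(".")
    if not dot or prefix not in REFERENCE_TYPES:
        raise ValueError(f"unrecognised prefix in {description!r}")
    return REFERENCE_TYPES[prefix]


print(reference_type("c.76dupA"))   # coding DNA
print(reference_type("p.Trp41*"))   # protein
```

Prefixes are case-sensitive here, so a description written as P.Trp41* is rejected rather than silently accepted.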
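3' rule: when a variant can be described at more than one position, the position closest to the 3' end of the reference sequence takes precedence. This applies to all variant descriptions at the genome, gene, transcript, and protein level.
Example: in the figure above, the sequence after the mutation is CGATTTGC, and four deletions could have produced it: positions 63 to 65, 64 to 66, 65 to 67, or 66 to 68. Under the 3' rule, the standard name is c.66_68del.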
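The 3' rule for a deletion can be sketched as a simple right shift. The deleted range can move one base toward the 3' end whenever its first base equals the base immediately following it. The function name `shift_deletion_3prime` and the toy reference sequence below are my own assumptions for illustration. They are not the sequence from the figure, and the positions are plain 1-based coordinates on the toy string rather than real c. numbering.

```python
def shift_deletion_3prime(ref: str, start: int, end: int) -> tuple[int, int]:
    """Shift a deletion of ref[start..end] (1-based, inclusive) as far 3' as possible."""
    if not (1 <= start <= end <= len(ref)):
        raise ValueError("deletion lies outside the reference")
    # ref[start - 1] is the first deleted base; ref[end] is the base right
    # after the deleted range. If they match, deleting one position further
    # to the right yields the same resulting sequence.
    while end < len(ref) and ref[start - 1] == ref[end]:
        start += 1
        end += 1
    return start, end


# Hypothetical reference with a tandem TTG repeat at positions 4 to 9.
ref = "CGATTGTTGGC"
for first in (4, 5, 6, 7):
    start, end = shift_deletion_3prime(ref, first, first + 2)
    print(f"{first}_{first + 2}del -> {start}_{end}del")
# Every line ends in 7_9del, the most 3' of the four equivalent positions.
```

All four starting positions give the same resulting sequence (CGATTGGC), which is why a single preferred description is needed.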
When a variant can be described in more than one way, the order of priority is: (1) deletion (2) inversion (3) duplication (4) conversion (5) insertion
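The priority order above amounts to picking the earliest entry in a fixed list. This is a small sketch with names of my own choosing (`PRIORITY`, `preferred_type`). It assumes the candidate descriptions have already been worked out by hand and only picks between them.

```python
# Order of preference, as given in the rule above.
PRIORITY = ["deletion", "inversion", "duplication", "conversion", "insertion"]


def preferred_type(candidates):
    """Return the candidate variant type that ranks first in PRIORITY."""
    # list.index raises ValueError for a type that is not in PRIORITY,
    # and min raises ValueError for an empty candidate list.
    return min(candidates, key=PRIORITY.index)


print(preferred_type(["insertion", "duplication"]))  # duplication
```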
【Characters with a specific meaning】
“+” (plus) is used in nucleotide numbering; c.123+45A>G
“-” (minus) is used in nucleotide numbering; c.124-56C>T
“*” (asterisk) is used in nucleotide numbering and to indicate a translation termination (stop) codon (see Standards); c.*32G>A and p.Trp41*
“_” (underscore) is used to indicate a range; g.12345_12678del
“[ ]” (square brackets) are used for alleles (see DNA, RNA, protein)
“;” (semicolon) is used to separate variants and alleles; g.[123456A>G;345678G>C] or g.[123456A>G];[345678G>C]
“,” (comma) is used to separate different transcripts/proteins derived from one allele; r.[123a>t, 122_154del]
“:” (colon) is used to separate the reference sequence file identifier (accession.version_number) from the actual description of a variant; NC_000011.9:g.12345611G>A
“( )” (parentheses) are used to indicate uncertainties and predicted consequences; NC_000023.9:g.(123456_234567)_(345678_456789)del, p.(Ser123Arg)
NOTE: the range of the uncertainty should be described as precisely as possible (see below)
“?” (question mark) is used to indicate unknown positions (nucleotide or amino acid); g.(?_234567)_(345678_?)del
“^” (caret) is used as “or”; c.(370A>C^372C>R) as back translation of p.Ser124Arg
“>” (greater than) is used to describe substitution variants (DNA and RNA level); g.12345A>T, r.123a>u (see DNA, RNA)
“{ }” (curly braces) suggested for the description of variants in otherwise perfect copy sequences (see Open Issues); g.24_65dup{46G>T}
“=” (equals) is used to indicate a sequence was tested but found unchanged; p.(Arg234=)
“/” (forward slash) is used to indicate mosaicism (see Complex (HGVS/ISCN))
“//” (double forward slash) is used to indicate chimerism (see Complex (HGVS/ISCN))
A note on terms: mosaicism and chimerism are often both rendered as 嵌合体 in Chinese. Mosaicism refers to cell populations that share one origin (a single zygote), while chimerism refers to cell populations of different origin (different zygotes).
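To make the roles of “:”, “+”, “-” and “*” concrete, here is a rough sketch that splits a full description at the colon and breaks a single position into its parts. The function names and the regular expression are my own, not an official grammar. It only handles the forms shown in the examples above (123+45, 124-56, *32), not ranges, uncertain positions, or alleles.

```python
import re


def split_reference(variant: str):
    """Split 'accession.version:description' at the first colon."""
    reference, sep, description = variant.partition(":")
    if not sep:
        # No colon: there is no reference part, only a description.
        return None, variant
    return reference, description


# Optional leading '*', a base number, then an optional '+' or '-' offset.
POSITION = re.compile(r"^(?P<star>\*?)(?P<base>\d+)(?P<offset>[+-]\d+)?$")


def parse_position(text: str) -> dict:
    """Parse one position such as '123+45', '124-56' or '*32'."""
    match = POSITION.match(text)
    if match is None:
        raise ValueError(f"unsupported position: {text!r}")
    return {
        "after_stop": match["star"] == "*",
        "base": int(match["base"]),
        "offset": int(match["offset"] or 0),
    }


print(split_reference("NC_000011.9:g.12345611G>A"))
# ('NC_000011.9', 'g.12345611G>A')
print(parse_position("124-56"))
# {'after_stop': False, 'base': 124, 'offset': -56}
```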
【Abbreviations used in variant descriptions】
“>” (greater than) indicates a substitution (DNA and RNA level); g.123456G>A, r.123c>u (see DNA, RNA)
a substitution at the protein level is described as p.Ser321Arg (see protein)
“del” indicates a deletion; c.76delA (see DNA, RNA, protein)
“dup” indicates a duplication; c.76dupA (see DNA, RNA, protein)
“ins” indicates an insertion; c.76_77insG (see DNA, RNA, protein)
duplicating insertions are described as duplications, not as insertions (duplication takes priority over insertion)
“inv” indicates an inversion; c.76_83inv (see DNA, RNA). Not used at protein level, usually described as “delins”
“con” indicates a conversion; NC_000022.10:g.42522624_42522669con42536337_42536382 (see DNA, RNA, protein)
“fs” indicates a frame shift; p.Arg456GlyfsTer17 (or p.Arg456Glyfs*17, see Frame shifts)
“ext” indicates an extension; p.Met1ext-5 (see Frame shifts). The variant activates a translation initiation site at upstream position -5, so the translated peptide is extended by 5 amino acids upstream of the start codon.
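The abbreviations above can be used for a quick keyword check of what kind of variant a description encodes. This is only a heuristic sketch with names of my own (`KEYWORDS`, `variant_type`), not a validating parser. It looks for the abbreviation as a substring and treats “delins” as its own case so that it is not mistaken for “del” or “ins”.

```python
import re

# Checked in order; "delins" must come before "del" and "ins".
KEYWORDS = [
    ("fs", "frame shift"),
    ("ext", "extension"),
    ("delins", "deletion-insertion"),
    ("del", "deletion"),
    ("dup", "duplication"),
    ("ins", "insertion"),
    ("inv", "inversion"),
    ("con", "conversion"),
]

# Protein-level substitution with three-letter codes, e.g. p.Ser321Arg.
PROTEIN_SUBSTITUTION = re.compile(r"^p\.\(?[A-Z][a-z]{2}\d+[A-Z][a-z]{2}\)?$")


def variant_type(description: str) -> str:
    """Guess the variant type from the abbreviation used in the description."""
    for keyword, name in KEYWORDS:
        if keyword in description:
            return name
    if ">" in description or PROTEIN_SUBSTITUTION.match(description):
        return "substitution"
    return "unknown"


for example in ("c.76_77insG", "c.76dupA", "p.Arg456GlyfsTer17", "p.Ser321Arg"):
    print(example, "->", variant_type(example))
```

The keyword order matters only for “delins”. The other abbreviations do not overlap with each other or with the three-letter amino acid codes.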