-
When building a database with snpEff, I used the following command:
java -jar snpEff.jar build -gtf22 -v Oar_rambouillet_v1.0
-
It suddenly failed with this error:
java.lang.RuntimeException: Error: Cannot find first coding exon for transcript:
PEKD01005337.1:-2277-18238, strand: -, id:ENSOART00020027463, bioType:protein_coding, Protein
5'UTR : PEKD01005337.1 18068-18238 UTR_5_PRIME 'UTR5_PEKD01005337.1_18069_18239'
Exons:
PEKD01005337.1:-2277--2049 'ENSOARE00020149813', rank: 2, frame: ., sequence: ctttgtgctataaaggccactcccatgacatacagggaagaggctcagttaaccaatttctaataaccaaatccacagccaacacggaattcctcccggaacctgggacctttataaagcggcattcgcagcctcttctccagcatcacctgcagagctcgtgacgccaacatgaggctccatcacctgctcctcgtgctcttcttcgtggtcctgtctgctgggtcag
PEKD01005337.1:4663-4914 'ENSOARE00020149859', rank: 1, frame: 2, sequence: gatttactcatggagtaacagatagtctaagctgccgttggaagaaaggcatctgtgtgctgaccaggtgccctggaaccatgagacagattggcacctgtttcgggcccccagtaaaatgctgcagactgaagtaacagaaggcgaagacgcggccggaccgatgcggagtcagaaactgcgtccttagacagagcgtctaaaatttaaaccagaaataaattttgtttcaaagttaaagaatcttgccca
3'UTR : PEKD01005337.1 4663-4777 UTR_3_PRIME 'UTR3_PEKD01005337.1_4664_4778'
CDS : tcagaaactgcgtccttagacagagcgtctaaaatttaaaccagaaataaattttgtttcaaagttaaagaatcttgcccactttgtgctataaaggccactcccatgacatacagggaagaggctcagttaaccaatttctaataaccaaatccacagccaacacggaattcctcccggaacctgggaccttta
Protein : SETASLDRASKI*TRNKFCFKVKESCPLCAIKATPMTYREEAQLTNF**PNPQPTRNSSRNLGPL
at org.snpeff.interval.Transcript.getFirstCodingExon(Transcript.java:1136)
at org.snpeff.interval.Transcript.frameCorrectionFirstCodingExon(Transcript.java:909)
at org.snpeff.interval.Transcript.frameCorrection(Transcript.java:878)
at org.snpeff.snpEffect.factory.SnpEffPredictorFactory.frameCorrection(SnpEffPredictorFactory.java:596)
at org.snpeff.snpEffect.factory.SnpEffPredictorFactory.finishUp(SnpEffPredictorFactory.java:545)
at org.snpeff.snpEffect.factory.SnpEffPredictorFactoryGff.create(SnpEffPredictorFactoryGff.java:348)
at org.snpeff.snpEffect.commandLine.SnpEffCmdBuild.run(SnpEffCmdBuild.java:369)
at org.snpeff.SnpEff.run(SnpEff.java:1183)
at org.snpeff.SnpEff.main(SnpEff.java:162)
java.lang.RuntimeException: Error reading file '/public/jychu/zhengxt/ovis_aries_genome/VCF/cjy_result/soft/snpEff/./data/Oar_rambouillet_v1.0/genes.gtf'
java.lang.RuntimeException: Error: Cannot find first coding exon for transcript:
PEKD01005337.1:-2277-18238, strand: -, id:ENSOART00020027463, bioType:protein_coding, Protein
5'UTR : PEKD01005337.1 18068-18238 UTR_5_PRIME 'UTR5_PEKD01005337.1_18069_18239'
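If you hit this error, the usual explanation is that the GTF file contains genes that are not in the FASTA file. In this log the transcript also starts at a negative coordinate (-2277), so the record itself looks inconsistent with the sequence.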
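One way to test the GTF/FASTA mismatch explanation is to compare the sequence names used in the GTF with the headers in the FASTA. The sketch below is only an illustration: the function name `list_missing_contigs` is made up for this note, and it assumes a tab-separated GTF whose first column is the sequence name and FASTA headers whose name ends at the first whitespace.

```shell
# list_missing_contigs GTF FASTA
# Hypothetical helper: prints sequence names that appear in the GTF
# but have no matching header in the FASTA.
list_missing_contigs() {
    gtf=$1
    fasta=$2
    gtf_names=$(mktemp)
    fa_names=$(mktemp)
    # Column 1 of every non-comment GTF line is the sequence name.
    grep -v '^#' "$gtf" | cut -f1 | sort -u > "$gtf_names"
    # FASTA header name: text after '>' up to the first whitespace.
    grep '^>' "$fasta" | sed 's/^>//; s/[[:space:]].*$//' | sort -u > "$fa_names"
    # comm -23 keeps lines that occur only in the first (GTF) list.
    comm -23 "$gtf_names" "$fa_names"
    rm -f "$gtf_names" "$fa_names"
}
```

An empty result means every sequence name in the GTF exists in the FASTA, in which case the problem lies in the transcript record rather than in a missing sequence.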
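So the fix is simply to delete the lines for that transcript from the GTF file (run this in the directory that holds genes.gtf):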
sed -i "/ENSOART00020027463/d" genes.gtf
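Because `sed -i` edits the file in place and drops every line that contains the ID anywhere, a slightly more cautious variant keeps a backup and matches only the `transcript_id` attribute. This is a sketch, not part of the original workflow: the function name `remove_transcript` and the `.bak` suffix are made up here, and it assumes the Ensembl-style attribute form `transcript_id "ID"`.

```shell
# remove_transcript GTF TRANSCRIPT_ID
# Hypothetical helper: backs up the GTF, removes all lines whose
# transcript_id attribute equals the given ID, and prints how many
# lines were removed.
remove_transcript() {
    gtf=$1
    tid=$2
    # Keep the untouched annotation next to the edited one.
    cp "$gtf" "$gtf.bak" || return 1
    # Keep only lines that do NOT contain: transcript_id "ID"
    awk -v pat="transcript_id \"$tid\"" 'index($0, pat) == 0' "$gtf.bak" > "$gtf" || return 1
    before=$(wc -l < "$gtf.bak")
    after=$(wc -l < "$gtf")
    echo $((before - after))
}
```

For the case above this would be called as `remove_transcript genes.gtf ENSOART00020027463`. If the build stops on another transcript afterwards, the same call can be repeated with the new ID; a printed count of 0 means nothing matched and the ID should be double-checked.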
- After that, the build ran successfully:
00:02:14 [Optional] Rare amino acid annotations
00:02:14 Warning: Cannot read optional protein sequence file '/public/jychu/zhengxt/ovis_aries_genome/VCF/cjy_result/soft/snpEff/./data/Oar_rambouillet_v1.0/protein.fa', nothing done.
00:02:14 Saving database
00:02:44 [Optional] Reading regulation elements: GFF
00:02:44 Warning: Cannot read optional regulation file '/public/jychu/zhengxt/ovis_aries_genome/VCF/cjy_result/soft/snpEff/./data/Oar_rambouillet_v1.0/regulation.gff', nothing done.
00:02:44 [Optional] Reading regulation elements: BED
00:02:44 Cannot find optional regulation dir '/public/jychu/zhengxt/ovis_aries_genome/VCF/cjy_result/soft/snpEff/./data/Oar_rambouillet_v1.0/regulation.bed/', nothing done.
00:02:44 [Optional] Reading motifs: GFF
00:02:44 Warning: Cannot open PWMs file /public/jychu/zhengxt/ovis_aries_genome/VCF/cjy_result/soft/snpEff/./data/Oar_rambouillet_v1.0/pwms.bin. Nothing done
00:02:44 Done
00:02:44 Logging
00:02:45 Checking for updates...
00:02:46 Done.
-
Confirm the result:
# If the build succeeded, a ".bin" file is created in the Oar_rambouillet_v1.0 directory
(base) [jychu@localhost snpEff]$ cd data/Oar_rambouillet_v1.0/
(base) [jychu@localhost Oar_rambouillet_v1.0]$ ls
genes.gtf snpEffectPredictor.bin
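The same check can be scripted so that a pipeline stops when the database file is missing or empty. A minimal sketch: the function name `check_snpeff_db` is made up for this note, and it only tests that `snpEffectPredictor.bin` exists and is non-empty, not that its contents are valid.

```shell
# check_snpeff_db DB_DIR
# Hypothetical helper: succeeds only if DB_DIR/snpEffectPredictor.bin
# exists and is not empty.
check_snpeff_db() {
    db_dir=$1
    bin="$db_dir/snpEffectPredictor.bin"
    if [ -s "$bin" ]; then
        echo "OK: $bin"
    else
        echo "MISSING or empty: $bin" >&2
        return 1
    fi
}
```

From the snpEff directory this would be run as `check_snpeff_db data/Oar_rambouillet_v1.0`.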