Installing the genome annotation pipeline BRAKER2

GitHub link

https://github.com/Gaius-Augustus/BRAKER

Reference

https://www.jianshu.com/p/e6a5e1f85dda

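The installation procedure on the GitHub page looks fairly involved.

The reference post mentions that BRAKER2 can also be installed with conda,

so I tried the conda route.

Create a new virtual environment (the braker2 package comes from the bioconda channel, so the commands below assume the bioconda and conda-forge channels are already configured):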

conda create -n braker2
conda activate braker2
conda install braker2

This step hung overnight without finishing. I saw someone suggest trying mamba instead:

mamba install braker2

This got further, but it failed with an error:

Could not solve for environment specs
Encountered problems while solving:

package libmambapy-1.3.0-py311h1f88262_0 requires openssl >=3.0.7,<4.0a0, but none of the providers can be installed

The Python in this environment was 3.11. After switching to 3.10 the error went away:

conda install python=3.10 -y

Then install again:

mamba install braker2

When the installation finishes, it prints this notice:

The config/ directory from AUGUSTUS can be accessed with the variable AUGUSTUS_CONFIG_PATH.
BRAKER2 requires this directory to be in a writable location, so if that is not the case, copy this directory to a writable location, e.g.:
cp -r /mnt/shared/scratch/myan/apps/mingyan/Biotools/mambaforge/envs/braker2/config/ /absolute_path_to_user_writable_directory/
export AUGUSTUS_CONFIG_PATH=/absolute_path_to_user_writable_directory/config

Due to license and distribution restrictions, GeneMark and ProtHint should be additionally installed for BRAKER2 to fully work.
These packages can be either installed as part of the BRAKER2 environment, or the PATH variable should be configured to point to them.
The GeneMark key should be located in /home/myan/.gm_key and GENEMARK_PATH should include the path to the GeneMark executables.


done
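
The notice above asks for a writable copy of the AUGUSTUS config/ directory. A minimal sketch of that step follows, wrapped in a hypothetical helper named copy_augustus_config. The function name and the target directory are placeholders of mine; the source path is whatever the installer printed for your environment.

```shell
# Hypothetical helper: copy the AUGUSTUS config/ directory to a writable
# location and point AUGUSTUS_CONFIG_PATH at the copy.
copy_augustus_config() {
    src=$1          # config/ directory inside the conda environment
    dest_parent=$2  # a directory you can write to

    if [ ! -d "$src" ]; then
        echo "config directory not found: $src" >&2
        return 1
    fi
    mkdir -p "$dest_parent" || return 1
    # Skip the copy if an earlier run already created it, so that an
    # existing writable copy is not overwritten.
    if [ ! -d "$dest_parent/config" ]; then
        cp -r "$src" "$dest_parent/config" || return 1
    fi
    AUGUSTUS_CONFIG_PATH=$dest_parent/config
    export AUGUSTUS_CONFIG_PATH
}

# Example call (both paths are placeholders):
# copy_augustus_config "$CONDA_PREFIX/config" "$HOME/augustus"
```

The export only lasts for the current shell, so the same export line would also need to go into ~/.bashrc to persist.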

Next, GeneMark has to be configured by hand.

Download it from http://exon.gatech.edu/GeneMark/license_download.cgi

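The download page offers builds for Linux kernel 2.6 and kernel 3.10. I am not sure yet what the difference is; I chose 3.10.

Fill in the requested information and accept the license to get the download links.

For configuring GeneMark, see https://github.com/Gaius-Augustus/BRAKER

The last step described there matters most. I don't know why, but if I skip it the scripts do not use the perl from the conda environment, and I keep getting errors.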

Add this line to ~/.bashrc:

export GENEMARK_PATH=/home/myan/biotools/gmes_linux_64_4
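
If you script this setup, appending the export line every time leaves duplicates in ~/.bashrc. Here is a small sketch of an idempotent append; add_export_once is a hypothetical helper name, not part of BRAKER or GeneMark.

```shell
# Hypothetical helper: append 'export NAME="VALUE"' to a shell startup file
# unless a line exporting NAME is already there.
add_export_once() {
    name=$1
    value=$2
    rcfile=$3
    if grep -q "^export $name=" "$rcfile" 2>/dev/null; then
        echo "$name is already exported in $rcfile; edit that line by hand" >&2
        return 0
    fi
    printf 'export %s="%s"\n' "$name" "$value" >> "$rcfile"
}

# Example call using the GeneMark directory from this post:
# add_export_once GENEMARK_PATH /home/myan/biotools/gmes_linux_64_4 ~/.bashrc
```

Open a new shell or run source ~/.bashrc afterwards so the variable is actually set.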

Then change into the gmes_linux_64_4 directory

and run

perl change_path_in_perl_scripts.pl "/home/myan/anaconda3/envs/braker2/bin/perl"

Alternatively, follow the approach from this post:
https://www.jianshu.com/p/8ac6a884c3c1

sed -i "s/\/usr\/bin\/perl/\/home\/myan\/anaconda3\/envs\/braker2\/bin\/perl/g" ./*.pl

This probably has about the same effect as the change_path_in_perl_scripts.pl script.
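
The sed command above replaces /usr/bin/perl everywhere it appears in a file, not only on the first line. A slightly more cautious sketch that rewrites only line 1, and only when that line is a perl shebang, could look like this. rewrite_perl_shebang is a hypothetical helper name, and the interpreter path is whatever perl your environment provides.

```shell
# Hypothetical helper: point the shebang of every *.pl file in a directory
# at a given perl, leaving the rest of each file untouched.
rewrite_perl_shebang() {
    dir=$1
    new_perl=$2
    for script in "$dir"/*.pl; do
        [ -f "$script" ] || continue
        # Only touch files whose first line is a perl shebang.
        if head -n 1 "$script" | grep -q '^#!.*perl'; then
            tmp=$(mktemp) || return 1
            sed "1s|^#!.*|#!$new_perl|" "$script" > "$tmp"
            # Write back through the original file so its permissions
            # (including the executable bit) are kept.
            cat "$tmp" > "$script"
            rm -f "$tmp"
        fi
    done
}

# Example call with the paths used in this post:
# rewrite_perl_shebang /home/myan/biotools/gmes_linux_64_4 /home/myan/anaconda3/envs/braker2/bin/perl
```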

That should complete the configuration.

Then run the annotation command:

braker.pl --genome=at_chr1.fa --bam=SRR4420293.sorted.bam

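This is data for just one Arabidopsis thaliana chromosome; leave me a comment if you need it.

This step takes a long time.

Normally the first step should be to mask the repeats in the genome; I forgot to do that here.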
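
Before starting a long run, it helps to check whether the genome FASTA is already soft-masked, meaning repeats are written in lowercase. The sketch below only measures the lowercase fraction; it does not do any masking. softmasked_percent is a hypothetical helper name. If the genome has been soft-masked (for example with RepeatMasker's -xsmall option), braker.pl can be told so with its --softmasking option.

```shell
# Hypothetical helper: print the percentage of soft-masked (lowercase)
# bases in a FASTA file.
softmasked_percent() {
    awk '
        /^>/ { next }                          # skip sequence headers
        {
            total += length($0)
            line = $0
            masked += gsub(/[a-z]/, "", line)  # count lowercase letters
        }
        END {
            if (total == 0) { print "0.00"; exit }
            printf "%.2f\n", 100 * masked / total
        }
    ' "$1"
}

# Example call with the genome file from this post:
# softmasked_percent at_chr1.fa
```

A result of 0.00 suggests the file was never soft-masked.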

Update 2023-03-13

The steps above ran successfully, but with other data I hit another error and at first did not know how to fix it.

Error message:

The most common problem is an expired or not present file ~/.gm_key!

Download gmes_linux_64_4 and the matching key again, then use sed to replace the first line (#!/usr/bin/perl) of the perl scripts in gmes_linux_64_4.
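
Since the error message points at ~/.gm_key, a quick check of that file before a long run can save time. This is only a rough sketch. gm_key_status is a hypothetical helper, and the 365-day default is an assumed threshold, not GeneMark's actual license term; the file's modification time is also just a proxy for when the key was issued.

```shell
# Hypothetical helper: report whether a GeneMark key file looks usable.
# max_age_days is an assumed threshold, not GeneMark's real license term.
gm_key_status() {
    key=${1:-$HOME/.gm_key}
    max_age_days=${2:-365}
    if [ ! -s "$key" ]; then
        echo "missing"       # absent or empty file
    elif [ -n "$(find "$key" -mtime +"$max_age_days")" ]; then
        echo "old"           # last modified more than max_age_days ago
    else
        echo "ok"
    fi
}

# Example call with the default location:
# gm_key_status
```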

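I had never understood what the #!/usr/bin/perl line at the top of some scripts actually does.

From making this change today, my guess is that once a script is made executable and run directly, the system uses the interpreter named on its first line.

For example, if the first line of a Python script is #!/usr/bin/python2,

it runs under python2 even when the current environment provides python3; the active environment has no effect.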
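
To see which interpreter each script will ask for, you can print the first line of every script in a directory. A minimal sketch, with list_shebangs as a hypothetical helper name:

```shell
# Hypothetical helper: print each *.pl script in a directory together with
# the interpreter named on its first line.
list_shebangs() {
    for script in "$1"/*.pl; do
        [ -f "$script" ] || continue
        first_line=$(head -n 1 "$script")
        case $first_line in
            # Strip the leading "#!" and print "file<TAB>interpreter".
            '#!'*) printf '%s\t%s\n' "${script##*/}" "${first_line#??}" ;;
        esac
    done
}

# Example call with the GeneMark directory from this post:
# list_shebangs /home/myan/biotools/gmes_linux_64_4
```

Any script that still lists /usr/bin/perl here will bypass the perl in the conda environment when it is executed directly.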


See also:

https://xuzhougeng.top/archives/genome-annotation-with-braker2

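A new thing I learned about conda

You do not have to start an environment with conda activate before calling its commands; you can run a given command from a specific environment directly: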

conda run -n rna-seq STAR --help

Reference: https://xuzhougeng.top/archives/activate-conda-environment-in-shell-script
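
This is handy inside shell scripts, where conda activate often does not work without extra setup. A minimal sketch of a wrapper follows; run_in_env is a hypothetical name, and it simply forwards its arguments to conda run.

```shell
# Hypothetical wrapper: run one command inside a named conda environment
# without activating that environment in the current shell.
run_in_env() {
    env_name=$1
    shift
    conda run -n "$env_name" "$@"
}

# Example call (environment and file names taken from this post):
# run_in_env braker2 braker.pl --genome=at_chr1.fa --bam=SRR4420293.sorted.bam
```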
