Official site: http://provean.jcvi.org/index.php
1. According to the README (http://provean.jcvi.org/downloads/README), the following software and databases must be downloaded:
(1) NCBI BLAST 2.4.0
Download from the NCBI FTP directory /blast/executables/blast+/2.4.0
(2) CD-HIT 3.1.2 (or more recent, but currently v4.6 and v4.6.1 are not recommended since those versions have a reported problem,
https://code.google.com/p/cdhit/issues/detail?id=18)
Pick a suitable version from [Releases · weizhongli/cdhit](https://github.com/weizhongli/cdhit/releases) and download it:
wget -c https://github.com/weizhongli/cdhit/releases/download/V4.8.1/cd-hit-v4.8.1-2019-0228.tar.gz
tar -zxvf cd-hit-v4.8.1-2019-0228.tar.gz
cd cd-hit-v4.8.1-2019-0228
make
cp cd-hit ~/bin
(3) NCBI nr (non-redundant) protein database
#!/bin/bash
for I in {000..116}
do
wget https://ftp.ncbi.nih.gov/blast/db/nr.${I}.tar.gz
wget https://ftp.ncbi.nih.gov/blast/db/nr.${I}.tar.gz.md5
done
# then extract each downloaded archive with tar
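The download loop above leaves one `.tar.gz` and one `.md5` file per nr volume. A minimal sketch of checking each archive before extracting it is shown below; the function name `verify_and_extract` is hypothetical, and it assumes GNU `md5sum` is available and that every archive sits next to its `.md5` file.

```shell
#!/bin/bash
# Verify every nr.*.tar.gz in a directory against its .md5 file, then extract it.
# The body runs in a subshell, so the caller's working directory is unchanged.
verify_and_extract() (
    cd "$1" || exit 1
    status=0
    for archive in nr.*.tar.gz; do
        [ -e "$archive" ] || continue   # pattern matched nothing
        if md5sum -c "${archive}.md5" >/dev/null 2>&1; then
            tar -zxf "$archive"
        else
            echo "checksum mismatch, re-download: $archive" >&2
            status=1
        fi
    done
    exit "$status"
)
```

Calling `verify_and_extract /path/to/blast/database` extracts only the archives whose checksum matches and returns non-zero if any volume needs to be downloaded again.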
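2. Download PROVEAN. wget could not reach the download site from the server, so a proxy was needed; alternatively, download the archive to a local computer and then transfer it to the server.
3. Install PROVEAN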
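Before running configure, it can help to confirm that the prerequisite programs are reachable on PATH. The sketch below uses a hypothetical helper named `check_tools`; the program names `psiblast` and `blastdbcmd` (from BLAST+) and `cd-hit` are assumed to be the ones PROVEAN calls. Since cd-hit was copied to ~/bin earlier, ~/bin needs to be on PATH for this check to pass.

```shell
#!/bin/bash
# Report which of the given programs are missing from PATH.
# Prints one line per missing tool and returns non-zero if any is missing.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed prerequisites: psiblast and blastdbcmd from BLAST+, cd-hit from CD-HIT.
check_tools psiblast blastdbcmd cd-hit || echo "install the missing tools or add them to PATH first"
```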
tar -zxvf provean-1.1.5.tar.gz
cd provean-1.1.5
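./configure BLAST_DB=/path/to/blast/database/nr # directory containing the nr database, followed by the database name nr (important)
make # failed with an error the first time; running make again succeeded
make install # failed with a permission error; fixed by running it with sudo
provean.sh -h # prints the help text below if the installation succeeded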
PROVEAN v1.1.5
USAGE:
provean.sh [Options]
Example:
# Given a query sequence in aaa.fasta file,
# compute scores for variations in bbb.var file
provean.sh -q aaa.fasta -v bbb.var
Required arguments:
-q <string>, --query <string>
Query protein sequence filename in fasta format
-v <string>, --variation <string>
Variation filename containing a list of variations:
one entry per line in HGVS notation,
e.g.: G105C, F508del, Q49dup, Q49_P50insC, Q49_R52delinsLI
Optional arguments:
--save_supporting_set <string>
Saves supporting sequence set infomation into a given filename
--supporting_set <string>
Supporting sequence set filename saved with '--save_supporting_set' option above
(This will save time for BLAST search and clustering.)
--tmp_dir <string>
Temporary directory used to store temporary files
--num_threads <integer>
Number of threads (CPUs) to use in BLAST search
-V, --verbose
Verbosely shows the information about procedure
-h, --help
Gives this help message
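Based on the help text above, a first run could be sketched as follows. The peptide, the variants, and the file names `query.fasta`, `variants.var`, and `query.sss` are made up for illustration, and the thread count of 4 is an arbitrary assumption; a real analysis needs a real protein sequence and its variants.

```shell
#!/bin/bash
# Toy inputs for illustration only; replace with a real protein and real variants.
workdir=$(mktemp -d)

# Query sequence in FASTA format (made-up peptide).
cat > "$workdir/query.fasta" <<'EOF'
>toy_protein
MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ
EOF

# One variation per line; the reference residues should match the query
# (here K is residue 2, A is residue 4, Q and R are residues 9 and 10).
cat > "$workdir/variants.var" <<'EOF'
K2R
A4del
Q9_R10insG
EOF

# Run only if PROVEAN is installed; all options come from the help text.
if command -v provean.sh >/dev/null 2>&1; then
    provean.sh -q "$workdir/query.fasta" -v "$workdir/variants.var" \
        --save_supporting_set "$workdir/query.sss" --num_threads 4
fi
```

On later runs for the same protein, passing `--supporting_set` with the saved file in place of `--save_supporting_set` should skip the BLAST search and clustering, as the help text notes.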