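I. Manual download and installation
Go to the KOBAS website and download the installation package and the matching databases.
II. Install with conda (recommended)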
conda install -c bioconda kobas==3.0.3
Installing kobas with conda raises an error if the environment does not use Python 2.7, which the package requires:
UnsatisfiableError: The following specifications were found
to be incompatible with the existing python installation in your environment:
Specifications:
- kobas==3.0.3 -> python[version='2.7.*|<3|>=2.7,<2.8.0a0']
Your python: python=3.8
If python is on the left-most side of the chain, that's the version you've asked for.
When python appears to the right, that indicates that the thing on the left is somehow
not available for the python version you are constrained to. Note that conda will not
change your python version to a different minor version unless you explicitly specify
that.
The following specifications were found to be incompatible with your system:
- feature:/linux-64::__glibc==2.28=0
- feature:|@/linux-64::__glibc==2.28=0
Your installed version is: 2.28
To fix this, create a new conda environment that uses Python 2.7:
conda create -n kobas python=2.7
# Activate the environment, then install kobas into it
conda activate kobas
conda install -c bioconda kobas==3.0.3
III. Configure kobas
Running kobas without configuring it first produces the following error.
(kobas) [user@localhost /data/pipeline/get_Enrichment]$kobas-annotate
Error: configuration file does not exist. Please create the file or provide the options through command line parameters.
Usage: kobas-annotate [-l] -i infile [-t intype] -s species [-o outfile] [-e evalue] [-r rank] [-n nCPUs] [-c coverage] [-z ortholog] [-k kobas_home] [-v blast_home] [-y blastdb] [-q kobasdb] [-p blastp] [-x blastx]
kobas-annotate: error: Option -i must be assigned.
1. Download the databases
Download sqlite3.tar.gz and seq_pep.tar.gz.
Link: ftp://ftp.cbi.pku.edu.cn/pub/KOBAS_3.0_DOWNLOAD/
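Once the two archives are downloaded, they need to be unpacked so that the sqlite3/ and seq_pep/ directories exist on disk. The following is a minimal sketch, not part of KOBAS itself. It assumes the archives were saved under /data/database/kobas (the database root used in the configuration in this post) and that the script is run with Python 3, outside the Python 2.7 kobas environment. The helper names extract_archive and top_level are made up for illustration.

```python
import tarfile
from pathlib import Path


def top_level(member_name):
    """Return the first real path component of an archive member, or None."""
    parts = [p for p in member_name.split("/") if p not in ("", ".")]
    return parts[0] if parts else None


def extract_archive(archive, dest):
    """Unpack a .tar.gz archive into dest and return its top-level entry names."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive), "r:gz") as tar:
        tops = {top_level(m.name) for m in tar.getmembers()}
        tar.extractall(str(dest))
    return sorted(t for t in tops if t)


if __name__ == "__main__":
    # Assumed location of the downloaded archives; adjust to your own layout.
    db_root = Path("/data/database/kobas")
    for name in ("sqlite3.tar.gz", "seq_pep.tar.gz"):
        archive = db_root / name
        if archive.exists():
            print(name, "->", extract_archive(archive, db_root))
        else:
            print("not found:", archive)
```

Only unpack archives from a source you trust, since extractall writes whatever paths the archive contains.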
2. Configure the ~/.kobasrc file
Write the paths of the databases downloaded above and the paths of the BLAST programs into it:
[KOBAS]
kobasdb = /data/database/kobas/sqlite3
[BLAST]
blastp = /data/pipeline/scRNA/miniconda/envs/kobas/bin/blastp
blastx = /data/pipeline/scRNA/miniconda/envs/kobas/bin/blastx
blastdb = /data/database/kobas/seq_pep/
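A wrong path in ~/.kobasrc is easy to miss until a run fails, so it can help to check the file first. The sketch below is one possible check, not a KOBAS feature. It assumes the file has exactly the [KOBAS] and [BLAST] sections and keys shown above, and that it is run with Python 3 outside the Python 2.7 kobas environment. The function name check_kobasrc is made up for illustration.

```python
import configparser
import os

# Sections and keys taken from the ~/.kobasrc example in this post.
EXPECTED = {"KOBAS": ["kobasdb"], "BLAST": ["blastp", "blastx", "blastdb"]}


def check_kobasrc(path):
    """Return (section, key, value) for every entry that is missing or whose path does not exist."""
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):
        raise FileNotFoundError(path)
    problems = []
    for section, keys in EXPECTED.items():
        for key in keys:
            value = parser.get(section, key, fallback=None)
            if value is None or not os.path.exists(os.path.expanduser(value)):
                problems.append((section, key, value))
    return problems


if __name__ == "__main__":
    rc = os.path.expanduser("~/.kobasrc")
    if os.path.exists(rc):
        for section, key, value in check_kobasrc(rc):
            print("problem: [%s] %s = %s" % (section, key, value))
    else:
        print("no config file at", rc)
```

An empty result means every configured path exists; it does not verify that the databases themselves are complete.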
IV. Test run
(kobas) [user@localhost /data/pipeline/get_Enrichment]$kobas-run
Usage: run_kobas.py [-l] -i infile [-t intype] -s species [-E evalue] [-R rank] [-N nCPUs] [-C coverage] [-Z ortholog] [-b bgfile] [-d database] [-m method] [-n fdr] [-o outfile] [-c cutoff] [-k kobas_home] [-v blast_home] [-y blastdb] [-q kobasdb] [-p blastp] [-x blastx]
Options:
-h, --help show this help message and exit
-l, --list list available species, or list available databases
for specific species
-i INFILE, --infile=INFILE
input data file
-t INTYPE, --intype=INTYPE
input type (fasta:pro, fasta:nuc, blastout:xml,
blastout:tab, id:refseqpro, id:uniprot, id:ensembl,
id:ncbigene), default fasta:pro
-s SPECIES, --species=SPECIES
species abbreviation (for example: ko for KEGG
Orthology, hsa for Homo sapiens, mmu for Mus musculus,
dme for Drosophila melanogaster, ath for Arabidopsis
thaliana, sce for Saccharomyces cerevisiae and eco for
Escherichia coli K-12 MG1655)
-E EVALUE, --evalue=EVALUE
expect threshold for BLAST, default 1e-5
-R RANK, --rank=RANK rank cutoff for valid hits from BLAST result, default
5
-N NCPUS, --nCPUs=NCPUS
number of CPUs to be used by BLAST, default 1
-C COVERAGE, --coverage=COVERAGE
subject coverage cutoff for BLAST, default 0
-Z ORTHOLOG, --ortholog=ORTHOLOG
whether only use ortholog for cross-species annotation
or not, default NO (If only use ortholog, give species
abbr)
-b BGFILE, --bgfile=BGFILE
background file, the output of annotate (3 or 4-letter
file name is not allowed), or species abbreviation
(for example: hsa for Homo sapiens, mmu for Mus
musculus, dme for Drosophila melanogaster, ath for
Arabidopsis thaliana, sce for Saccharomyces cerevisiae
and eco for Escherichia coli K-12 MG1655), default
same species as annotate
-d DB, --db=DB databases for selection, 1-letter abbreviation
separated by "/": K for KEGG PATHWAY, R for Reactome,
B for BioCyc, p for PANTHER, o for OMIM, k for KEGG
DISEASE, N for NHGRI GWAS Catalog and G for Gene
Ontology, S for Gene Ontology Slim, default
K/R/B/p/o/k/N/G/S
-m METHOD, --method=METHOD
choose statistical test method: b for binomial test, c
for chi-square test, h for hypergeometric test /
Fisher's exact test, and x for frequency list, default
hypergeometric test / Fisher's exact test
-n FDR, --fdr=FDR choose false discovery rate (FDR) correction method:
BH for Benjamini and Hochberg, BY for Benjamini and
Yekutieli, QVALUE, and None, default BH
-o OUTFILE, --outfile=OUTFILE
output file for identification result, default stdout
-c CUTOFF, --cutoff=CUTOFF
the gene number in a term is not less than the cutoff,
default 5
-k KOBAS_HOME, --kobashome=KOBAS_HOME
Optional parameter. To set path to kobas_home, which
is parent directory of sqlite3/ and seq_pep/ , default
value is read from ~/.kobasrcwhere you set before
running kobas. If you set this parameter, it means you
set "kobasdb" and "blastdb" in this following
directory. e.g. "-k /home/user/kobas/", means that you
set kobasdb = /home/user/kobas/sqlite3/ and blastdb =
/home/user/kobas/seq_pep/
-v BLAST_HOME, --blasthome=BLAST_HOME
Optional parameter. To set parent directory of blastx
and blastp. If you set this parameter, it means you
set "blastx" and "blastp" in this following directory.
Default value is read from ~/.kobasrc where you set
before running kobas
-y BLASTDB, --blastdb=BLASTDB
Optional parameter. To set path to sep_pep/, default
value is read from ~/.kobasrc where you set before
running kobas
-q KOBASDB, --kobasdb=KOBASDB
Optional parameter. To set path to sqlite3/, default
value is read from ~/.kobasrc where you set before
running kobas, e.g. "-q /kobas_home/sqlite3/"
-p BLASTP, --blastp=BLASTP
Optional parameter. To set path to blastp program,
default value is read from ~/.kobasrc where you set
before running kobas
-x BLASTX, --blastx=BLASTX
Optional parameter. To set path to blasx program,
default value is read from ~/.kobasrc where you set
before running kobas
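As an illustration of how the options above combine, the sketch below assembles a kobas-run command line from a few of them (-i, -t, -s, -d, -o, all taken from the help text). It only builds and prints the command and does not run KOBAS. The file names genes.txt and enrich.txt and the function build_kobas_run are hypothetical, and the script is assumed to run with Python 3.

```python
import shlex


def build_kobas_run(infile, species, intype="fasta:pro", outfile=None, databases=None):
    """Build an argument list for kobas-run using flags from its help text."""
    cmd = ["kobas-run", "-i", infile, "-t", intype, "-s", species]
    if databases:
        # -d takes 1-letter database codes separated by "/", e.g. K/G
        cmd += ["-d", "/".join(databases)]
    if outfile:
        cmd += ["-o", outfile]
    return cmd


if __name__ == "__main__":
    # Hypothetical inputs: a list of NCBI gene IDs for human, KEGG PATHWAY plus GO.
    cmd = build_kobas_run(
        "genes.txt",
        "hsa",
        intype="id:ncbigene",
        outfile="enrich.txt",
        databases=["K", "G"],
    )
    print(" ".join(shlex.quote(part) for part in cmd))
```

The printed command can be pasted into a shell where the kobas environment is active, or the list can be passed to subprocess.run from a script launched inside that environment.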