This is the first time I have really seen the benefit of 文献鸟: it pushed a bioinformatics paper to me.
https://doi.org/10.1093/bib/bbab538
https://github.com/YanCCscu/meangs
Its function is: "The MEANGS is a seed-free software that applies trie-search to extend contigs from self-discovery seeds and assemble mitogenome, from NGS data." In other words, it assembles mitochondrial genomes.
At present MEANGS (v1.0) only supports paired-end data.
1. Installation and usage
git clone https://github.com/YanCCscu/MEANGS.git
cd MEANGS
./meangs.py --silence -1 1.fq.gz -2 2.fq.gz -o Out -t 16 -i 300 --species_class Arthropoda
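-i is the library insert length
-o specifies the output folder; ${prefix}_deep_detected_mito.fas is the assembled mitochondrial genome
--species_class specifies the taxon of the species being assembled; the available choices are listed in the parameter description below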
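To make the flag combinations above easier to reuse across samples, here is a minimal Python sketch that only builds the argument list for meangs.py. The helper name `build_meangs_cmd` and its defaults are my own assumptions, not part of MEANGS; the flags and the taxon names are taken from the help text quoted in this post.

```python
import shlex

# Taxon names copied from the --species_class choices in the help text.
SPECIES_CLASSES = {
    "A-worms", "Arthropoda", "Bryozoa", "Chordata", "Echinodermata",
    "Mollusca", "Nematoda", "N-worms", "Porifera-sponges",
}


def build_meangs_cmd(fq1, fq2, outbase, threads=16, insert=300,
                     species_class=None, deepin=False, silence=False,
                     script="./meangs.py"):
    """Hypothetical helper: return the meangs.py argument list (nothing is run)."""
    if species_class is not None and species_class not in SPECIES_CLASSES:
        raise ValueError(f"unknown --species_class: {species_class}")
    cmd = [script, "-1", fq1, "-2", fq2, "-o", outbase,
           "-t", str(threads), "-i", str(insert)]
    if species_class is not None:
        cmd += ["--species_class", species_class]
    if deepin:
        cmd.append("--deepin")
    if silence:
        cmd.append("--silence")
    return cmd


# Print the command as a shell-quoted string for inspection.
cmd = build_meangs_cmd("1.fq.gz", "2.fq.gz", "out_deep",
                       species_class="Arthropoda", deepin=True)
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from inside the MEANGS directory; building a list rather than a string avoids shell-quoting problems with unusual file names.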
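The software has built-in annotation. For annotation I used mitos2, which has an online web version that is easy to find by searching on Baidu, although in my own tests mitos1 sometimes seems to work better. To use either one, just upload the fasta file.
In my experience, CDSs annotated this way sometimes lack a start codon or a stop codon; finding a few closely related species on NCBI and correcting the coordinates by hand fixes it.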
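A quick way to spot which CDSs need that manual correction is to check the first and last codon of each extracted sequence. The sketch below is only an illustration: `check_cds` is a hypothetical helper, and the default codon sets assume NCBI translation table 5 (invertebrate mitochondrial code), so they need to be changed for other taxa. It also assumes the sequence is already on the coding strand.

```python
# Assumption: NCBI translation table 5 (invertebrate mitochondrial code).
# Replace these sets for vertebrates or other groups.
START_CODONS = {"ATT", "ATC", "ATA", "ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG"}


def check_cds(seq, starts=START_CODONS, stops=STOP_CODONS):
    """Report whether a coding-strand CDS has a start codon, a stop codon,
    and a length that is a multiple of three."""
    seq = seq.upper()
    return {
        "has_start": seq[:3] in starts,
        # Require at least two codons so a lone start codon is not
        # also counted as the stop codon.
        "has_stop": len(seq) >= 6 and seq[-3:] in stops,
        "length_multiple_of_3": len(seq) % 3 == 0,
    }


print(check_cds("ATGAAACCCTAA"))
print(check_cds("CCCAAACCCT"))
```

A missing stop codon is not always an annotation error: animal mitochondrial genes frequently end in an incomplete stop (T or TA) that is completed after transcription, so comparing against related species on NCBI remains the final check.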
mitos1/2 URLs
http://mitos.bioinf.uni-leipzig.de/index.py
http://mitos2.bioinf.uni-leipzig.de/index.py
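It seems to align the sequencing reads against these taxon databases (the --species_class choices) and assemble on that basis, so it can probably only assemble species from those groups.
My command line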
./meangs.py -1 1.fq.gz -2 2.fq.gz -o out_deep -t 16 -i 300 --species_class Arthropoda --deepin
3. Parameter description
usage: meangs.py [-h] [-1 FQ1] [-2 FQ2] [-o OUTBASE] [-t THREADS] [-i INSERT]
[-q QUALITY] [-n NSAMPLE] [-s SEQSCAF]
[--species_class {A-worms,Arthropoda,Bryozoa,Chordata,Echinodermata,Mollusca,Nematoda,N-worms,Porifera-sponges}]
[--deepin] [--clip] [--keepIntMed] [--keepMinLen KEEPMINLEN]
[--skipassem] [--skipqc] [--skiphmm] [--skipextend]
[--silence]
optional arguments:
-h, --help show this help message and exit
-1 FQ1, --fq1 FQ1 Input paired end _1.fq[.gz] files,seprated by ','
-2 FQ2, --fq2 FQ2 Input paired end _2.fq[.gz] files,seprated by ','
-o OUTBASE, --outBase OUTBASE
Output prefix of dir and files
-t THREADS, --threads THREADS
Analysis threads
-i INSERT, --insert INSERT
library insert length
-q QUALITY, --quality QUALITY
Threshold value for low base quality
-n NSAMPLE, --nsample NSAMPLE
Number of reads sampled from input reads, default 0
(keep all reads)
-s SEQSCAF, --seqscaf SEQSCAF
specific a sequences files(fasta) just for annotation
--species_class {A-worms,Arthropoda,Bryozoa,Chordata,Echinodermata,Mollusca,Nematoda,N-worms,Porifera-sponges}
taxon of species belong to
--deepin run deeper mode to assembly mitogenome
--clip detect circle clip point for mitogenome
--keepIntMed keep the intermediate files
--keepMinLen KEEPMINLEN
Threshold of reads length to keep after remove low
quality bases
--skipassem skip the process of assembly
--skipqc skip the process of QC
--skiphmm skip the process of hmmer
--skipextend skip the process of extend in deepin mode
--silence run the program in silence mode, the standard output
will redirect to specific log file
Example:
#run meangs in quick mode with a paired-end library of insert size 350bp, using 16 threads.
meangs.py --silence -1 1.fq.gz -2 2.fq.gz -o OutBase -t 16 -i 350
#run meangs in 'deepin mode'; the first 2000000 reads in both input fastq files will be used to construct the mito-genome
meangs.py -1 R1.fastq.gz -2 R2.fastq.gz -o A3 -t 16 -n 2000000 -i 300 --deepin
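To illustrate what "the first 2000000 reads in both input fastq files" means in practice, here is a small Python sketch that copies the first N records of a gzipped FASTQ file. This is not how MEANGS implements -n; `head_fastq` is a hypothetical helper, and it assumes the usual four-lines-per-record FASTQ layout with no wrapped sequence lines.

```python
import gzip
from itertools import islice


def head_fastq(src, dst, n_reads):
    """Copy the first n_reads records (4 lines each) from src to dst.

    Both files are gzip-compressed FASTQ. Returns the number of records
    actually written, which is smaller than n_reads for short inputs.
    """
    n_lines = 0
    with gzip.open(src, "rt") as fin, gzip.open(dst, "wt") as fout:
        for line in islice(fin, 4 * n_reads):
            fout.write(line)
            n_lines += 1
    return n_lines // 4
```

Calling it with the same `n_reads` on the _1 and _2 files keeps the read pairs in sync, since paired-end files list mates in the same order.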
About the output
All output files are stored in one directory assigned by the -o option.
The prefix_deep_detected_mito.fas file is the final assembled mitochondrial genome. Genes in the mitochondrial genome are annotated automatically and stored in the file prefix_hmmout_tbl_sorted.gff
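As a first sanity check on the assembled genome, the sketch below reads a FASTA file and reports the length of each sequence. The helper names are hypothetical, and the path has to be supplied by hand; I am not assuming anything about the exact directory layout MEANGS produces beyond the file name given above.

```python
def read_fasta(path):
    """Parse a FASTA file into a dict of {record name: sequence}."""
    records = {}
    name = None
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-separated token as the name.
                name = (line[1:].split() or [""])[0]
                records[name] = []
            elif name is not None:
                records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def sequence_lengths(path):
    """Return {record name: sequence length} for a FASTA file."""
    return {key: len(seq) for key, seq in read_fasta(path).items()}
```

For example, `sequence_lengths("path/to/prefix_deep_detected_mito.fas")` should show one record whose length is in the range expected for the taxon (roughly 14 to 20 kb for most animal mitogenomes); several short records usually mean the assembly is fragmented.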