Using MEGAN

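Reference: "Getting Started with MEGAN, a Powerful Tool for Metagenome Annotation and Visualization", Liu Yongxin's blog (Metagenome official account), CSDN blog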

Run blastx separately on each of the two fq files of the paired-end (PE) data:

/biostack/tools/alignment/diamond-2.0.4/diamond blastx -c 1 --db /biostack/database/nr/db/nr.dmnd -t tmp1 -p 24 -q all_other.R1.fq.gz --daa diamond-C1.1.daa >diamond.log1 &

/biostack/tools/alignment/diamond-2.0.4/diamond blastx -c 1 --db /biostack/database/nr/db/nr.dmnd -t tmp2 -p 24 -q all_other.R2.fq.gz --daa diamond-C1.2.daa >diamond.log2 &

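Note: the tmp1 and tmp2 directories (passed to -t as DIAMOND's temporary directories) must be created beforehand. This step takes a long time.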
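The two background jobs above can be wrapped so that the temporary directories are created automatically and the shell waits for both alignments to finish. This is only a minimal sketch: the function name `run_blastx_mates` and the `DIAMOND` / `NR_DB` environment overrides are hypothetical, and redirecting stderr into the log files (`2>&1`) is an addition that the original commands do not make.

```shell
# Paths copied from the commands above; override through the environment if needed.
DIAMOND="${DIAMOND:-/biostack/tools/alignment/diamond-2.0.4/diamond}"
NR_DB="${NR_DB:-/biostack/database/nr/db/nr.dmnd}"

# run_blastx_mates PREFIX SAMPLE   (hypothetical helper name)
# Aligns PREFIX.R1.fq.gz and PREFIX.R2.fq.gz in the background, one job per
# mate, then waits for both jobs and returns non-zero if either one failed.
run_blastx_mates() {
    local prefix="$1" sample="$2" mate pid pids="" status=0
    for mate in 1 2; do
        mkdir -p "tmp${mate}" || return 1      # the -t directory must exist
        "$DIAMOND" blastx -c 1 --db "$NR_DB" -t "tmp${mate}" -p 24 \
            -q "${prefix}.R${mate}.fq.gz" \
            --daa "diamond-${sample}.${mate}.daa" \
            > "diamond.log${mate}" 2>&1 &      # 2>&1 is an addition (assumption)
        pids="$pids $!"
    done
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return "$status"
}
```

With the file names used in this post, the call would be `run_blastx_mates all_other C1`. Note that both jobs request 24 threads each, as in the original commands.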

Convert the daa files into the rma file that MEGAN requires:

/biostack/tools/microbiome/MEGAN_Community-6.19.2/tools/daa2rma -i diamond-C1.1.daa diamond-C1.2.daa --paired -ms 50 -me 0.01 -top 50 -mdb /biostack/database/megan/megan-map-May2020.db -o diamond-C1.rma
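Because the DIAMOND jobs run in the background for a long time, it can help to confirm that both daa files exist and are non-empty before starting the conversion, and to keep the screen output in a file. The following is a sketch under assumptions: the helper name `convert_pair_to_rma`, the `DAA2RMA` / `MEGAN_MAP` overrides, and the log file name `daa2rma-SAMPLE.log` are all hypothetical; the daa2rma options themselves are copied unchanged from the command above.

```shell
# Paths copied from the command above; override through the environment if needed.
DAA2RMA="${DAA2RMA:-/biostack/tools/microbiome/MEGAN_Community-6.19.2/tools/daa2rma}"
MEGAN_MAP="${MEGAN_MAP:-/biostack/database/megan/megan-map-May2020.db}"

# convert_pair_to_rma SAMPLE   (hypothetical helper name)
# Refuses to start if either DAA file is missing or empty, then runs daa2rma
# with the same options as above and saves its screen output to a log file.
convert_pair_to_rma() {
    local sample="$1" daa
    for daa in "diamond-${sample}.1.daa" "diamond-${sample}.2.daa"; do
        if [ ! -s "$daa" ]; then
            echo "missing or empty input: $daa" >&2
            return 1
        fi
    done
    "$DAA2RMA" -i "diamond-${sample}.1.daa" "diamond-${sample}.2.daa" --paired \
        -ms 50 -me 0.01 -top 50 -mdb "$MEGAN_MAP" \
        -o "diamond-${sample}.rma" > "daa2rma-${sample}.log" 2>&1
}
```

For the sample in this post the call would be `convert_pair_to_rma C1`.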


The output is as follows:
Version MEGAN Community Edition (version 6.19.2, built 17 Jun 2020)

Author(s) Daniel H. Huson

Copyright (C) 2020 Daniel H. Huson. This program comes with ABSOLUTELY NO WARRANTY.

Functional classifications to use: EGGNOG, GTDB, INTERPRO2GO, SEED

Loading ncbi.map: 2,249,459

Loading ncbi.tre: 2,249,463

Loading eggnog.map:    30,875

Loading eggnog.tre:    30,986

Loading gtdb.map:  182,187

Loading gtdb.tre:  182,191

Loading interpro2go.map:    12,738

Loading interpro2go.tre:    28,689

Loading seed.map:      978

Loading seed.tre:      979

In DAA files: diamond-C1.1.daa, diamond-C1.2.daa

Output file:  diamond-C1.rma

Classifications: Taxonomy, SEED, EGGNOG, GTDB, INTERPRO2GO

Generating RMA6 file Parsing matches

Annotating RMA6 file using FAST mode (accession database and first accession per line)

Parsing file diamond-C1.1.daa

Parsing file: diamond-C1.1.daa

10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (2397.7s)

Parsing file diamond-C1.2.daa

Parsing file: diamond-C1.2.daa

10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (2336.6s)

Total reads:        18,425,737

Alignments:        389,875,476

100% (0.0s)

100% (0.0s)

Linking paired reads

Number of pairs:            0

Binning reads: Initializing...

Initializing binning...

Using paired reads in taxonomic assignment...

Using 'Naive LCA' algorithm for binning: Taxonomy

Using Best-Hit algorithm for binning: SEED

Using Best-Hit algorithm for binning: EGGNOG

Using 'Naive LCA' algorithm for binning: GTDB

Using Best-Hit algorithm for binning: INTERPRO2GO

Binning reads...

Binning reads: Analyzing alignments

Total reads:      18,425,737

With hits:          18,425,737

Alignments:        389,875,476

Assig. Taxonomy:    18,366,746

Assig. SEED:        11,106,599

Assig. EGGNOG:      11,487,509

Assig. GTDB:        17,662,177

Assig. INTERPRO2GO: 10,358,118

MinSupport set to: 9212

Binning reads: Applying min-support & disabled filter to Taxonomy...

Min-supp. changes:      8,122

Binning reads: Applying min-support & disabled filter to GTDB...

Min-supp. changes:      20,690

Binning reads: Writing classification tables

Numb. Tax. classes:        133

Numb. SEED classes:        750

Numb. EGG. classes:      7,456

Numb. GTDB classes:        104

Numb. INT. classes:      8,954

Binning reads: Syncing

Class. Taxonomy:          133

Class. SEED:              750

Class. EGGNOG:          7,456

Class. GTDB:              104

Class. INTERPRO2GO:      8,954

100% (19792.7s)

Total time:  24,536s

Peak memory: 144.1 of 195.3 G

The run took close to 7 hours (24,536 s in total).
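The key numbers in a log like the one above (total reads, linked pairs, reads assigned per classification, min-support, run time) can be pulled out with a short awk script. This is a sketch that assumes the screen output was saved to a file and that the line labels match the log shown here; the function name `summarize_daa2rma_log` and the output layout are hypothetical. It is worth a look at the pairs value in particular, since the log above reports `Number of pairs: 0` even though `--paired` was given.

```shell
# summarize_daa2rma_log LOGFILE   (hypothetical helper name)
# Prints tab-separated summary lines: total reads, linked pairs, assigned reads
# per classification with percentage of total, min-support, and hours elapsed.
summarize_daa2rma_log() {
    awk '
        { gsub(/,/, "") }                        # drop thousands separators
        /^Total reads:/        { total = $3 }
        /^Number of pairs:/    { pairs = $4 }
        /^Assig\. /            { name = $2; sub(/:$/, "", name)
                                 order[++n] = name; assigned[name] = $3 }
        /^MinSupport set to:/  { minsupp = $4 }
        /^Total time:/         { secs = $3; sub(/s$/, "", secs) }
        END {
            if (total <= 0) { print "no Total reads line found"; exit 1 }
            printf "total_reads\t%d\n", total
            printf "pairs\t%d\n", pairs
            for (i = 1; i <= n; i++)
                printf "%s\t%d\t%.2f%%\n", order[i], assigned[order[i]],
                       100 * assigned[order[i]] / total
            printf "min_support\t%d\t%.3f%%\n", minsupp, 100 * minsupp / total
            printf "hours\t%.1f\n", secs / 3600
        }
    ' "$1"
}
```

Applied to the log above, this should report pairs as 0, about 99.68% of reads assigned to Taxonomy, a min-support of roughly 0.050% of the total reads, and about 6.8 hours.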

Extract the taxonomic annotation data:

/biostack/tools/microbiome/MEGAN_Community-6.19.2/tools/rma2info -i diamond-C1.rma -c2c Taxonomy -r2c Taxonomy -n true --paths true --ranks true --list true --listMore true --bacteriaOnly true -v > C1Taxonomy1.txt

Screen output:

RMA2Info - Analyses an RMA file

Options:

Input and Output

        --in: diamond-C1.rma

        --out: -

Commands

        --list: true

        --listMore: true

        --class2count: Taxonomy

        --read2class: Taxonomy

        --names: true

        --paths: true

        --ranks: true

        --majorRanksOnly: false

        --bacteriaOnly: true

        --virusOnly: false

        --ignoreUnassigned: true

Other:

        --verbose: true

Version  MEGAN Community Edition (version 6.19.2, built 17 Jun 2020)

Author(s) Daniel H. Huson

Copyright (C) 2020 Daniel H. Huson. This program comes with ABSOLUTELY NO WARRANTY.

Loading MEGAN File: diamond-C1.rma

Loading ncbi.map: 2,249,459

Loading ncbi.tre: 2,249,463

Total time: 492s

Peak memory: 29.8 of 195.3 G

Extract the EGGNOG annotation:

/biostack/tools/microbiome/MEGAN_Community-6.19.2/tools/rma2info -i diamond-C1.rma -r2c EGGNOG -n true --paths true --ranks true --list true --listMore true -v > C1eggnog.txt

Extract the SEED annotation:

/biostack/tools/microbiome/MEGAN_Community-6.19.2/tools/rma2info -i diamond-C1.rma -r2c SEED -n true --paths true --ranks true --list true --listMore true -v > C1SEED.txt

Extract the INTERPRO2GO annotation:

/biostack/tools/microbiome/MEGAN_Community-6.19.2/tools/rma2info -i diamond-C1.rma -r2c INTERPRO2GO -n true --paths true --ranks true --list true --listMore true -v > C1INTERPRO2GO.txt
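The three functional extractions above differ only in the classification name, so they can be run in a loop. A minimal sketch, assuming the same rma2info options as above; the function name `extract_functional` and the `RMA2INFO` override are hypothetical, and the output files are named `SAMPLE` + classification in upper case (so `C1EGGNOG.txt` rather than `C1eggnog.txt`).

```shell
# Path copied from the commands above; override through the environment if needed.
RMA2INFO="${RMA2INFO:-/biostack/tools/microbiome/MEGAN_Community-6.19.2/tools/rma2info}"

# extract_functional RMA SAMPLE   (hypothetical helper name)
# Writes one read-to-class table per functional classification,
# stopping at the first rma2info call that fails.
extract_functional() {
    local rma="$1" sample="$2" class
    for class in EGGNOG SEED INTERPRO2GO; do
        "$RMA2INFO" -i "$rma" -r2c "$class" -n true --paths true --ranks true \
            --list true --listMore true -v > "${sample}${class}.txt" || return 1
    done
}
```

For the file produced earlier the call would be `extract_functional diamond-C1.rma C1`.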

Download the MEGAN6 Community Edition installers: https://software-ab.informatik.uni-tuebingen.de/download/megan6/welcome.html
