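This work has already been published as the research article "Metabolic interaction models recapitulate leaf microbiota ecology".
Highlight: genome-scale models of different organisms + niche inference = information on species interactions.
Notes on the genome-scale metabolic model algorithm: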
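Example: the authors found that the strains compete for carbon. Strains could still grow to some extent when no carbon source was added, and this could be offset by the uptake of amino acids and organic acids. Carbon use was assessed by manual inspection and image processing, and carbon utilization capacity was examined alongside phylogeny. The degree of niche overlap between strains was then judged from their carbon utilization profiles and quantified with the niche overlap index (NOI). Overall, among all strains, the Rhizobia strains showed high NOI.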
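The NOI idea can be illustrated with a minimal sketch. It assumes the NOI of a focal strain is the number of carbon sources it shares with another strain divided by the number of carbon sources the focal strain uses. This definition is an assumption made for illustration, and the strain names and carbon profiles are hypothetical, not data from the paper.

```python
def niche_overlap_index(focal_sources, other_sources):
    """Share of the focal strain's carbon sources that the other strain also uses.

    Assumed definition: |focal AND other| / |focal|.
    """
    focal = set(focal_sources)
    other = set(other_sources)
    if not focal:
        raise ValueError("focal strain uses no carbon sources; NOI is undefined")
    return len(focal & other) / len(focal)


# Hypothetical carbon-use profiles (not measured data).
profiles = {
    "strain_A": {"glucose", "fructose", "sucrose", "malate"},
    "strain_B": {"glucose", "malate"},
}

noi_a_vs_b = niche_overlap_index(profiles["strain_A"], profiles["strain_B"])
noi_b_vs_a = niche_overlap_index(profiles["strain_B"], profiles["strain_A"])
print(noi_a_vs_b, noi_b_vs_a)
```

Under this definition the index is asymmetric: a specialist whose carbon sources are all used by a generalist has a high NOI with respect to that generalist, while the reverse is not necessarily true.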
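Why is it called a genome-scale metabolic model? Each strain's metabolic capabilities and physiological traits are captured by roughly 5,000 reactions, a number that correlates moderately with the genome size of the corresponding strain.
PART 1: Model generation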
*At*-LSPHERE genome-scale metabolic model generation pipeline
This collection of scripts will output a set of curated metabolic models based on organism genomes and experimental information. It is divided into four subsections:
(1) generation of draft models using CarveMe (Machado *et al.*, 2018),
(2) initial gapfilling of the draft models using NICEgame (Vayena *et al.*, 2022),
(3) additional gapfilling of the models to resolve false positive and false negative growth predictions, and
(4) final model formatting and annotation, followed by verification using MEMOTE (Lieven *et al.*, 2020).

This guide is based on a recommended folder structure for storing models and reports.
# Local quickstart
Software requirements:
* [MATLAB](https://www.mathworks.com/products/matlab.html) R2021a or higher
* [CarveMe](https://carveme.readthedocs.io/en/latest/installation.html)
* [Python](https://www.python.org) 3.6 or 3.7
* [COBRA Toolbox](https://opencobra.github.io/cobratoolbox/stable/) v2.24.3 or higher
* [IBM CPLEX Solver](https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer) v12.10
* NICEgame (from this repository)
* [MEMOTE](https://memote.readthedocs.io/en/latest/)
## Generate draft metabolic reconstructions using CarveMe:
1. Download all desired genomes (in this repo, these are in 'Models/Genomes/').
2. Using a command line interface, navigate to the CarveMe installation directory and initialize the software:

        $ python3 /Applications/carveme-master/carveme/__init__.py

3. To generate models for all genomes in a directory, navigate to the directory in which the genomes are stored (i.e., 'Models/Genomes/') and run:

        for infile in *.faa.zip; do
            # Model name = genome file name up to the first '.'
            outfile=$(echo "$infile" | awk -F'[.]' '{print $1}')
            carve "$infile" -o "../CarveMe/sbml_noGF/$outfile.xml"
        done

This will create one SBML draft model corresponding to each genome, and will store them in the 'sbml_noGF' directory.
Alternatively, to generate models for individual genomes, navigate to the desired directory and run:

    carve --refseq GCF_XXXXXXXXX.1 -o ../CarveMe/sbml_noGF/GCF_XXXXXXXXX.xml

**Key outputs:**
* One draft genome-scale model (in SBML format) for each input genome
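
After the batch run, it can be useful to confirm that every genome produced a draft model. The following is a minimal sketch of such a check, not part of the repository: the helper names `model_name` and `missing_models` are hypothetical, and the paths assume the folder layout described above.

```python
from pathlib import Path


def model_name(genome_filename):
    """Mirror the awk step: keep the text before the first '.' and add '.xml'."""
    return genome_filename.split(".")[0] + ".xml"


def missing_models(genome_dir, model_dir, pattern="*.faa.zip"):
    """Return genome files in genome_dir that have no draft model in model_dir."""
    model_dir = Path(model_dir)
    return sorted(
        genome.name
        for genome in Path(genome_dir).glob(pattern)
        if not (model_dir / model_name(genome.name)).exists()
    )


if __name__ == "__main__":
    # Assumed locations, following the folder layout used in this guide.
    for name in missing_models("Models/Genomes", "Models/CarveMe/sbml_noGF"):
        print("no draft model for", name)
```

Because the output name is cut at the first '.', genome files whose names contain additional dots (for example a versioned accession) are truncated at that point, and two such files can map to the same model name.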
## Generate gapfilled models using NICEgame:
**Main script:**
* Gapfilling/NICEgame/gapFillModelTFA.m
**Key inputs:**
* Draft models (in 'FBA/Models/CarveMe/sbml_noGF/')
* Carbon source screen data ('Medium/CSourceScreen_Jul2022.xlsx')
1. Unpack the matTFA toolbox located in NICEgame/matTFA-master/matTFA.zip
2. Open MATLAB and the 'gapFillModelTFA.m' script. This script generates genome-scale metabolic models from previously-generated CarveMe reconstructions and experimental data using the matTFA (Thermodynamic Flux Analysis, Salvy *et al.*, 2019) and NICEgame (Vayena *et al.*, 2022) pipelines.
This script takes a CarveMe draft metabolic model of an organism and its corresponding experimental data (an .xlsx file recording growth/no growth on each carbon source) as its main inputs. It performs gapfilling using NICEgame and matTFA, which merge the draft model with a universal metabolite/reaction database and constrain reactions using thermodynamic information. NICEgame then finds candidate reactions that need to be added to the reconstruction to enable growth on each carbon source.
The script then selects the best combination of gapfilled reactions to use by predicting the growth/no growth phenotype of each model on combinations of solutions. It then saves COBRA model files for downstream curation.
**Key outputs:**
* List of candidate reactions for gapfilling (in 'FBA/Models/NICEgame/GapfillingResults/')
* Gapfilled models (one per organism, in 'FBA/Models/NICEgame/Gapfilled/')
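
The selection of a combination of gapfilling solutions, as described in this section, can be pictured with a toy sketch. This is not the logic of 'gapFillModelTFA.m': here each candidate solution is reduced to the set of carbon sources it is assumed to rescue, whereas the actual pipeline predicts growth by simulating the model. All names and data below are hypothetical.

```python
from itertools import combinations


def score(predicted_growth, observed_growth):
    """Count carbon sources where predicted and observed growth agree."""
    return sum(
        (source in predicted_growth) == grows
        for source, grows in observed_growth.items()
    )


def best_combination(base_growth, candidates, observed_growth):
    """Exhaustively pick the candidate subset that best matches the screen.

    Subsets are tried in order of size and only a strictly better score
    replaces the current best, so ties go to the smaller subset.
    """
    best_names, best_score = (), score(base_growth, observed_growth)
    names = sorted(candidates)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            predicted = set(base_growth).union(*(candidates[n] for n in subset))
            subset_score = score(predicted, observed_growth)
            if subset_score > best_score:
                best_names, best_score = subset, subset_score
    return best_names, best_score


# Hypothetical screen: True = growth observed on that carbon source.
observed = {"glucose": True, "fructose": True, "malate": False, "citrate": True}
base_growth = {"glucose"}  # sources the draft model already grows on
candidates = {
    "sol_1": {"fructose"},
    "sol_2": {"fructose", "malate"},  # would also introduce a false positive
    "sol_3": {"citrate"},
}

chosen, agreement = best_combination(base_growth, candidates, observed)
print(chosen, agreement)
```

The exhaustive search grows exponentially with the number of candidates, so it only serves to show the idea of scoring combinations against the growth/no growth data.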
## Perform additional model curation to resolve false negative and positive growth:
**Main scripts:**
* Gapfilling/getModelAccuracy.m
* Gapfilling/troubleshootFalsePosNeg.m
**Key inputs:**
* Gapfilled models (one per organism, in 'FBA/Models/NICEgame/Gapfilled/')
* Carbon source screen data ('Medium/CSourceScreen_Jul2022.xlsx')
1. Run the 'getModelAccuracy.m' script, which will output a .mat file containing accuracy statistics of all models in the relevant directory.
2. Run the 'troubleshootFalsePosNeg.m' script, which will reference other models within the collection to correct for false negative and positive growth predictions. Here, the threshold for false positives and the method of correction can be adjusted.
**Key outputs:**
* FP/FN-corrected gapfilled models (one per organism, in 'FBA/Models/NICEgame/Gapfilled/FPFNCorrected/')
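
To make the false positive/false negative terminology concrete, here is a minimal sketch of how predicted and observed growth could be compared for one model. The exact statistics written by 'getModelAccuracy.m' are not specified here, so the function names and data are illustrative assumptions.

```python
def confusion_counts(predicted, observed):
    """Tally agreement between model predictions and the carbon source screen.

    FP: the model predicts growth but none was observed.
    FN: the model predicts no growth but growth was observed.
    """
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for source, grows in observed.items():
        model_grows = predicted[source]
        if model_grows and grows:
            counts["TP"] += 1
        elif model_grows and not grows:
            counts["FP"] += 1
        elif not model_grows and grows:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


def accuracy(counts):
    """Fraction of carbon sources predicted correctly."""
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total if total else float("nan")


# Hypothetical data for a single strain.
observed = {"glucose": True, "fructose": True, "malate": False,
            "citrate": False, "xylose": True}
predicted = {"glucose": True, "fructose": False, "malate": True,
             "citrate": False, "xylose": True}

counts = confusion_counts(predicted, observed)
print(counts, accuracy(counts))
```

In this toy example, fructose is a false negative (a candidate for further gapfilling) and malate is a false positive (a candidate for constraining or removing a reaction).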
## Perform final model formatting:
**Main scripts:**
* Final/finalModelFormatting.m
**Key inputs:**
* FP/FN-corrected gapfilled models (one per organism, in 'FBA/Models/NICEgame/Gapfilled/FPFNCorrected/')
* Annotation databases (in 'FBA/Scripts/ModelGeneration/Final/databases/')
1. Run the 'finalModelFormatting.m' script, which will attempt to annotate all model metabolites, genes, reactions, and subsystems. It will output a .mat file containing the formatted model in COBRA format, as well as an SBML model in .xml.
**Key outputs:**
* Annotated models in .mat format (one per organism, in 'FBA/Models/Final/')
* Annotated models in SBML format (one per organism, in 'FBA/Models/Final/sbml')
## Verify models using MEMOTE:
**Key inputs:**
* Annotated models in SBML format (in 'FBA/Models/Final/sbml')
1. Navigate to the directory containing the gapfilled models in SBML format and run MEMOTE via a command line interface to verify the models:

        for i in *.xml; do
            # Write one HTML report per model; stop at the first failure
            memote report snapshot --filename "../../Reports/${i%.*}.html" "$i" || break
        done

**Key outputs:**
* MEMOTE quality scores for each model (in 'FBA/Models/Reports/')
*At*-LSPHERE genome-scale metabolic model simulation scripts
These scripts simulate competitive outcomes between previously generated genome-scale models and compare these outcomes to experimental data. This guide is based on a recommended folder structure for storing models, but the paths can be modified in each script.
# Local quickstart
Software requirements:
* [MATLAB](https://www.mathworks.com/products/matlab.html) R2021a or higher
* [COBRA Toolbox](https://opencobra.github.io/cobratoolbox/stable/) v2.24.3 or higher
* [IBM CPLEX Solver](https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer) v12.10
## Compute competitive outcomes and compare to experimental data:
**Main script:**
* competitiveOutcomesPairs.m
**Key inputs:**
* Curated models (in 'Models/Final/')
* Medium composition ('Medium/minMedCSourceScreen.mat')
1. Open MATLAB and the 'competitiveOutcomesPairs.m' script. This script computes competitive outcomes between strain pairs and community compositions, and compares them to experimental outcomes if desired.
**Key outputs:**
* Pairwise and community competitive outcomes and associated metabolic flux information
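There are two basic approaches to defining biomass composition. The most common is a single lumped reaction that contains all biomass precursors. Alternatively, the biomass equation can be split into several reactions, each focused on a different macromolecular component, for example: a (1 gDW ash) + b (1 gDW phospholipid) + c (free fatty acids) + d (1 gDW carbohydrate) + e (1 gDW protein) + f (1 gDW RNA) + g (1 gDW DNA) + h (vitamins/cofactors) + x ATP + x H2O -> 1 gDCW biomass + x ADP + x H + x Pi. Which approach is preferable depends largely on the use case.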
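
A simple consistency check follows from the split biomass equation: if the coefficients of the components expressed per gram dry weight are read as mass fractions, they should sum to roughly 1 g so that the product is 1 gDCW of biomass. The sketch below assumes that reading; the fractions are hypothetical and do not come from any of the models.

```python
import math


def check_biomass_fractions(fractions, tolerance=1e-6):
    """Return (is_consistent, total) for a dict of mass fractions in g per gDW."""
    total = sum(fractions.values())
    return math.isclose(total, 1.0, abs_tol=tolerance), total


# Hypothetical mass fractions for the gDW-based components of the equation.
fractions = {
    "ash": 0.05,
    "phospholipid": 0.09,
    "carbohydrate": 0.10,
    "protein": 0.55,
    "RNA": 0.17,
    "DNA": 0.04,
}

is_consistent, total = check_biomass_fractions(fractions)
print(is_consistent, round(total, 6))
```

Free fatty acids and vitamins/cofactors are left out of this toy example because the equation above does not express them per gram dry weight; whether they count toward the 1 g total depends on how a given model defines them.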