Reconstructing High-Quality Large-Scale Metabolic Models with merlin

Dias, Oscar; Rocha, Miguel; Ferreira, Eugénio Campos; Rocha, Isabel

doi:10.1007/978-1-4939-7528-0_1

Part of the book series: Methods in Molecular Biology (MIMB, volume 1716)

Abstract

Here, the basic principles of reconstructing genome-scale metabolic models with merlin are described. merlin covers the basic stages of this process and provides several tools for assembling models, using the sequenced genome as a starting point.

merlin has two main modules, which separate the process of annotating the genome (enzymes, transporters, and compartments) from the process of model assembly, though information from the former is integrated into the latter after curation. Moreover, merlin provides several tools to curate the model, including tools for generating reactions' gene rules and placeholder entities for biomass precursors, such as proteins (e-protein) or nucleotides (e-DNA and e-RNA), among others.

This tutorial covers each feature of merlin in detail, including the assessment of experimental data for the validation of the model.

Author information

Authors and Affiliations

Centre of Biological Engineering, University of Minho, Braga, Portugal
Oscar Dias, Miguel Rocha, Eugénio Campos Ferreira & Isabel Rocha

Corresponding author

Correspondence to Oscar Dias.

Editor information

Editors and Affiliations

Department of Biology, University of Florence, Sesto Fiorentino, Florence, Italy
Marco Fondi

1 Electronic Supplementary Material

Fig. S1

NCBI assembly webpage. The genome can be accessed from the links on the right (GenBank: green arrow; RefSeq: dashed green arrow). Other relevant links are listed below these. The link inside the red ellipse allows retrieving the taxonomy identifier (blue circle) from the NCBI taxonomy database (PDF 723 kb)

Fig. S2

InterProScan report. Red circle—submenu for accessing the report. Genes with InterProScan’s reports are noticeable by buttons with purple background (PDF 471 kb)

Fig. S3

Transporters annotation panel. Black circle—information types available in the information window; Red ellipse—integrate similarity information with TRIAGE’s TAD; blue ellipse—create transport reactions; green ellipse—integrate to model or export information to tabular file. The information panel shows several ontology reactions, derived from the primary transporters’ annotations (PDF 418 kb)

Fig. S4

Compartments annotation panel. Secondary compartments may be annotated if their score is close to that of the main compartment (PDF 367 kb)

Fig. S5

Growth rate versus ATP flux. The slope represents the growth ATP requirements and the y-intercept value indicates the maintenance ATP flux (PDF 28 kb)
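
The relationship described in Fig. S5 can be sketched as a straight-line fit. The following is a minimal sketch, not merlin's implementation: it assumes ATP flux is plotted on the y-axis against growth rate on the x-axis, and the data values, the units, and the helper name `fit_line` are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired data points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical measurements (illustrative only, not taken from the chapter).
growth_rates = [0.1, 0.2, 0.3, 0.4]   # specific growth rate, assumed 1/h
atp_fluxes = [7.0, 12.0, 17.0, 22.0]  # ATP flux, assumed mmol/gDW/h

# Slope: growth-associated ATP requirement.
# Intercept: maintenance ATP flux (the ATP flux extrapolated to zero growth).
growth_atp, maintenance_atp = fit_line(growth_rates, atp_fluxes)

print(f"growth ATP requirement: {growth_atp:.2f}")
print(f"maintenance ATP flux: {maintenance_atp:.2f}")
```

With real data the points will not fall exactly on a line, so the fitted slope and intercept are estimates whose quality depends on the number and spread of the growth rates measured.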

Fig. S6

merlin’s main interface. The main interface has three main components, namely the operation bar (blue square), the clipboard (green square), and the data visualizer (red square) (PDF 187 kb)

Fig. S7

RefSeq multispecies annotation (PDF 262 kb)

Fig. S8

Flowchart for the annotation of new transporters from TCDB (PDF 206 kb)

Fig. S9

Example of plots for determining the specific growth rate (a) and specific consumption rate (b). In the former, only the first five data points should be selected to perform the linear regression, as the others do not belong to the exponential growth phase (PDF 94 kb)
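
The point selection described for plot (a) can be illustrated with a short sketch. It assumes, as is common practice but not stated in the caption, that the specific growth rate is the slope of the natural logarithm of biomass concentration against time. The data, the units, and the helper name `fit_slope` are hypothetical.

```python
import math


def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired data points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical growth curve: five exponential-phase points (rate 0.3 per hour)
# followed by two points where growth has slowed down.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # assumed hours
biomass = [0.1 * math.exp(0.3 * t) for t in times[:5]] + [0.34, 0.345]  # assumed g/L

log_biomass = [math.log(x) for x in biomass]

# Regression restricted to the exponential phase (first five points).
mu_exponential = fit_slope(times[:5], log_biomass[:5])

# Regression over all points, shown only to illustrate the bias.
mu_all_points = fit_slope(times, log_biomass)

print(f"specific growth rate, first five points: {mu_exponential:.3f} 1/h")
print(f"slope using all points: {mu_all_points:.3f} 1/h")
```

Including the later points lowers the fitted slope, which is why the regression is limited to the points in the exponential growth phase.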

About this protocol

Cite this protocol

Dias, O., Rocha, M., Ferreira, E.C., Rocha, I. (2018). Reconstructing High-Quality Large-Scale Metabolic Models with merlin. In: Fondi, M. (ed.) Metabolic Network Reconstruction and Modeling. Methods in Molecular Biology, vol 1716. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7528-0_1

DOI: https://doi.org/10.1007/978-1-4939-7528-0_1
Published: 09 December 2017
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-7527-3
Online ISBN: 978-1-4939-7528-0
eBook Packages: Springer Protocols
