Reconstructing High-Quality Large-Scale Metabolic Models with merlin

  • Protocol
Metabolic Network Reconstruction and Modeling

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1716))

Abstract

Here, the basic principles of reconstructing genome-scale metabolic models with merlin are described. merlin covers the basic stages of this process and provides several tools for assembling models, starting from the sequenced genome.

merlin has two main modules, which separate genome annotation (of enzymes, transporters, and compartments) from model assembly; after curation, information from the former is integrated into the latter. Moreover, merlin provides several tools to curate the model, including tools for generating reactions’ gene rules and placeholder entities for biomass precursors, such as proteins (e-protein) or nucleotides (e-DNA and e-RNA), among others.

This tutorial covers each feature of merlin in detail, including the assessment of experimental data for the validation of the model.

Author information

Corresponding author

Correspondence to Oscar Dias.

1 Electronic Supplementary Material

Fig. S1

NCBI assembly webpage. The genome can be accessed from the links on the right (GenBank: green arrow; RefSeq: dashed green arrow). Other relevant links are listed below them. The link inside the red ellipse allows retrieving the taxonomy identifier (blue circle) from the NCBI taxonomy database (PDF 723 kb)

Fig. S2

InterProScan report. Red circle—submenu for accessing the report. Genes with InterProScan’s reports are noticeable by buttons with purple background (PDF 471 kb)

Fig. S3

Transporters annotation panel. Black circle—information types available in the information window; Red ellipse—integrate similarity information with TRIAGE’s TAD; blue ellipse—create transport reactions; green ellipse—integrate to model or export information to tabular file. The information panel shows several ontology reactions, derived from the primary transporters’ annotations (PDF 418 kb)

Fig. S4

Compartments annotation panel. Secondary compartments may be annotated if their score is close to that of the main compartment (PDF 367 kb)

Fig. S5

Growth rate versus ATP flux. The slope represents the growth ATP requirements and the y-intercept value indicates the maintenance ATP flux (PDF 28 kb)

Fig. S6

merlin’s main interface. The main interface has three main components, namely the operation bar (blue square), the clipboard (green square), and the data visualizer (red square) (PDF 187 kb)

Fig. S7

RefSeq multispecies annotation (PDF 262 kb)

Fig. S8

Flowchart for the annotation of new transporters from TCDB (PDF 206 kb)

Fig. S9

Example of plots for determining the specific growth rate (a) and specific consumption rate (b). In the former, only the first five data points should be selected to perform the linear regression, as the others do not belong to the exponential growth phase (PDF 94 kb)
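
As a rough illustration of the linear regressions mentioned in the captions of Fig. S5 and Fig. S9 (a), the following Python sketch fits a straight line by ordinary least squares. The function names and the sample values are hypothetical and are not part of merlin. Taking the natural logarithm of biomass before the fit in `specific_growth_rate` is an assumption based on common practice for exponential growth; the caption states only that a linear regression is performed on the first five data points.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def atp_requirements(growth_rates, atp_fluxes):
    """Fig. S5 reading: the slope is the growth ATP requirement and
    the y-intercept is the maintenance ATP flux."""
    return linear_fit(growth_rates, atp_fluxes)


def specific_growth_rate(times, biomass, n_points=5):
    """Fig. S9 (a) reading: slope of ln(biomass) versus time, using only
    the first n_points samples (assumed to be the exponential phase)."""
    logs = [math.log(x) for x in biomass[:n_points]]
    slope, _ = linear_fit(times[:n_points], logs)
    return slope


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the functions.
    growth, maintenance = atp_requirements([0.1, 0.2, 0.3, 0.4],
                                           [6.0, 10.0, 14.0, 18.0])
    print("growth ATP requirement:", growth)
    print("maintenance ATP flux:", maintenance)

    times = [0, 1, 2, 3, 4, 5, 6]
    biomass = [0.1 * math.exp(0.5 * t) for t in range(5)] + [0.75, 0.76]
    print("specific growth rate:", specific_growth_rate(times, biomass))
```

The cutoff of five points is exposed as `n_points` because the number of samples in the exponential phase depends on the experiment and should be checked on the plot before fitting.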

Copyright information

© 2018 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Dias, O., Rocha, M., Ferreira, E.C., Rocha, I. (2018). Reconstructing High-Quality Large-Scale Metabolic Models with merlin. In: Fondi, M. (ed) Metabolic Network Reconstruction and Modeling. Methods in Molecular Biology, vol 1716. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7528-0_1

  • DOI: https://doi.org/10.1007/978-1-4939-7528-0_1

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-7527-3

  • Online ISBN: 978-1-4939-7528-0

  • eBook Packages: Springer Protocols
