Identification of Natural Product Biosynthetic Gene Clusters from Bacterial Genomic Data

Eustáquio, Alessandra S.; Ziemert, Nadine

doi:10.1007/7653_2018_32

Part of the book series: Methods in Pharmacology and Toxicology

Abstract

The frequent re-isolation of known compounds is one of the main challenges of traditional screening methods for natural product drug discovery. The ability to connect natural products to the genes that encode them, and vice versa, has the potential to revolutionize discovery efforts. Increasingly sophisticated bioinformatic tools are being developed that can not only identify biosynthetic genes in sequenced genomes but also predict the product class or structure in silico. This information can then guide the targeted discovery of new compounds. In this chapter, we describe how to prioritize bacterial strains for genome sequencing and how to identify biosynthetic gene clusters in bacterial genomes. We also give a short introduction to how comparative genomics can help identify different congeners of a specific class of natural products of interest, and to the limitations of structure prediction. We do not attempt to be exhaustive but rather provide examples that the reader can actively follow.

Acknowledgments

The authors acknowledge the Department of Medicinal Chemistry and Pharmacognosy of the University of Illinois at Chicago and the Microbiology/Biotechnology Interfaculty Institute of Microbiology and Infection Medicine of the University of Tübingen for start-up funds during the course of writing this chapter.

Author information

Authors and Affiliations

Department of Medicinal Chemistry and Pharmacognosy and Center for Biomolecular Sciences, College of Pharmacy, University of Illinois at Chicago, Chicago, IL, USA
Alessandra S. Eustáquio
Microbiology/Biotechnology Interfaculty Institute of Microbiology and Infection Medicine, University of Tübingen, Tübingen, Germany
Nadine Ziemert

Corresponding authors

Correspondence to Alessandra S. Eustáquio or Nadine Ziemert.

About this protocol

Cite this protocol

Eustáquio, A.S., Ziemert, N. (2018). Identification of Natural Product Biosynthetic Gene Clusters from Bacterial Genomic Data. In: Methods in Pharmacology and Toxicology. Humana Press. https://doi.org/10.1007/7653_2018_32

DOI: https://doi.org/10.1007/7653_2018_32
Published: 17 August 2018
Publisher Name: Humana Press