Abstract
The frequent re-isolation of known compounds is one of the main challenges of traditional screening methods in natural product drug discovery. The ability to connect natural products to the genes that encode them, and vice versa, has the potential to revolutionize discovery efforts. Increasingly sophisticated bioinformatic tools are being developed that not only identify biosynthetic genes in sequenced genomes but also predict the product class or structure in silico. This information can then guide the targeted discovery of new compounds. In this chapter, we describe how to prioritize bacterial strains for genome sequencing and how to identify biosynthetic gene clusters in bacterial genomes. We also give a short introduction to how comparative genomics can help identify different congeners of a specific class of natural products of interest, and to the limitations of structure prediction. We do not attempt to be exhaustive but rather provide examples that the reader can actively follow.
Acknowledgments
The authors acknowledge the Department of Medicinal Chemistry and Pharmacognosy of the University of Illinois at Chicago and the Microbiology/Biotechnology Interfaculty Institute of Microbiology and Infection Medicine of the University of Tübingen for start-up funds during the course of writing this chapter.
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Eustáquio, A.S., Ziemert, N. (2018). Identification of Natural Product Biosynthetic Gene Clusters from Bacterial Genomic Data. In: Methods in Pharmacology and Toxicology. Humana Press. https://doi.org/10.1007/7653_2018_32
DOI: https://doi.org/10.1007/7653_2018_32
Publisher Name: Humana Press