Encyclopedia of Metagenomics

Living Edition
Editors: Karen E. Nelson

Bacteriocin Mining in Metagenomes

  • Orla O’Sullivan
  • Colin Hill
  • Paul Ross
  • Paul Cotter
Living reference work entry
DOI: https://doi.org/10.1007/978-1-4614-6418-1_689-3


Keywords (added by machine, not by the authors): Lactic acid bacterium; Bacteriocin producer; Galapagos Island; Metagenomic database; Driver sequence


Bacteriocins are heat-stable ribosomally synthesized peptides produced by one bacterium which are active against other bacteria and against which the producer has a specific immunity mechanism.


Bacteriocins are ribosomally synthesized antimicrobial peptides that are produced by many bacteria and which kill or inhibit the growth of other bacteria. Bacteriocin producers are protected as a consequence of dedicated immunity (self-protective) systems (Cotter et al. 2005). Bacteriocins are of both academic and commercial interest, with several in use as food preservatives or as the active agent in clinical or veterinary antimicrobials. It is not surprising that there is significant interest in the identification and characterization of new bacteriocin gene clusters. The growing volume of metagenomic sequence data is an important resource which can be mined for the in silico discovery of novel bacteriocins.

A Background to Bacteriocins

Bacteriocins were first described in 1925 and since then bacteriocin producers have been identified in a myriad of different environments, bearing out a prediction by Klaenhammer in 1988 that bacteriocin production may be almost ubiquitous (Klaenhammer 1988). The spectrum of activity of these peptides can be narrow (lethal to bacteria in the same or closely related species) or broad (lethal to bacteria in other genera). Many bacteriocins function by depolarizing the cell membrane or through the inhibition of cell wall synthesis (Cotter et al. 2005). There are a number of different classification schemes. One approach, originally employed to classify bacteriocins produced by Gram-positive bacteria, has been to divide bacteriocins into two major classes: those which are modified (Class I) and those which are unmodified (Class II) (Cotter et al. 2005; Rea et al. 2011) (Table 1). This approach to classification excludes larger proteins, such as the bacteriolysins and the colicin-type antimicrobials, which as a consequence of their larger size may be regarded as representing different classes of antimicrobials.
Table 1 Classification scheme for bacteriocins (modified from Rea et al. 2011)

Class    | Subclass                                                | Further subclasses                   | Examples
Class I  | Ia: Lantibiotics                                        | Subclasses I–IV                      | Lacticin 3147, nisin A, subtilin
         | Ib: Labyrinthopeptins                                   |                                      | Labyrinthopeptins A1 and A2
         | Ic: Sactibiotics                                        | Single- and two-peptide bacteriocins | Thuricin CD, subtilosin A
Class II | IIa: Pediocin-like                                      | Subclasses I–IV                      | Pediocin PA-1, mundticin
         | IIb: Two-peptide bacteriocins                           | Subclasses A and B                   | Salivaricin P, lactococcin G
         | IIc: Circular bacteriocins                              | Subclasses 1 and 2                   | Acidocin B, gassericin A
         | IId: Linear non-pediocin-like single-peptide bacteriocins |                                    | Lactococcin A

Further classification of the Class I and II peptides is possible. For example, Class I bacteriocins from Gram-positive bacteria can be divided into Class Ia, Class Ib, and Class Ic. Class Ia, the lantibiotics, harbor the unusual posttranslationally modified residues lanthionine (Lan) and/or β-methyllanthionine (meLan); these are products of the interaction of cysteines with enzymatically dehydrated serines (dehydroalanine; Dha) and threonines (dehydrobutyrine; Dhb). Lantibiotics can be subdivided according to the enzyme responsible for lanthionine formation: subclass I peptides use LanBC, subclass II use LanM, and subclass III use RamC-like enzymes, while subclass IV peptides are modified by LanL enzymes. It should be noted, however, that subclass III and IV peptides identified to date have not been shown to possess antimicrobial activity and thus are referred to as lantipeptides. Class Ib, the labyrinthopeptins, have a labyrinthine structure and contain the posttranslationally modified amino acid labionin, formed through a series of serine phosphorylations, dehydrations of phosphoserines to didehydroalanines, and cyclizations. Class Ic, the sactibiotics, are cyclic peptides generated by the posttranslational formation of intramolecular cross-linkages between the α-carbon and sulfur of amino acids within the peptide. Class Ic bacteriocins can be further subdivided depending on whether they are single- or two-peptide bacteriocins.

Class II bacteriocins can be divided into Class IIa, Class IIb, Class IIc, and Class IId. Class IIa, the pediocin-like bacteriocins, are typically highly active against the food pathogen Listeria monocytogenes and contain a conserved hydrophilic, cationic motif in the N-terminal region termed the “pediocin box.” They can be subdivided into subclasses I–IV based on sequence homology. Class IIb are two-peptide unmodified bacteriocins. Both peptides are required for activity and both possess a conserved GxxG motif. These are further subdivided into two subclasses based on sequence homology. Class IIc bacteriocins are cyclic peptides that result from the covalent linkage of their N- and C-termini, and they tend to contain numerous α-helical structures. Class IIc can also be further divided into two subclasses based on sequence identity. Finally, Class IId, the unmodified, linear, non-pediocin-like bacteriocins, are in essence bacteriocins which do not fit into any of the other subclasses. (See Fig. 1 for an example of bacteriocin structure.)
Fig. 1 Structure of nisin A, the prototypical modified bacteriocin from Gram-positive bacteria (modified residues in gray)
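
The GxxG motif described above for Class IIb peptides lends itself to a simple pattern scan. The following is a minimal sketch, assuming the motif is read literally as glycine, any two residues, glycine; the function name and the example peptide are hypothetical and are not taken from any published tool or sequence.

```python
import re

# GxxG read literally: glycine, any two residues, glycine.
# The lookahead allows overlapping occurrences to be reported.
GXXG = re.compile(r"(?=(G[A-Z]{2}G))")


def find_gxxg(peptide: str) -> list[tuple[int, str]]:
    """Return (1-based position, matched residues) for every GxxG motif."""
    seq = peptide.upper()
    return [(m.start() + 1, m.group(1)) for m in GXXG.finditer(seq)]


# Hypothetical peptide, invented for illustration only.
example_peptide = "MKGAAGLVGSIGKW"
print(find_gxxg(example_peptide))
# [(3, 'GAAG'), (6, 'GLVG'), (9, 'GSIG')]
```

A motif this short occurs by chance in many unrelated proteins, so a hit of this kind would only be a weak hint that needs support from the surrounding gene cluster.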

In addition to the requirement for a precursor bacteriocin peptide, bacteriocin activity is also dependent on the production of several other proteins encoded within the corresponding bacteriocin gene cluster. This gene cluster may encode proteins responsible for bacteriocin transport, processing, regulation, immunity, and, in the case of the Class I bacteriocins, peptide modification enzymes. The highly conserved accessory proteins encoded by bacteriocin gene clusters can serve as useful driver sequences for downstream analysis as the bacteriocin peptides themselves can be very diverse in their primary sequences.
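
Because the modification enzymes define the lantibiotic subclasses, the accessory genes found in a cluster already hint at the class of peptide it may encode. The sketch below illustrates that reasoning under stated assumptions: the family labels ("LanB", "radical SAM", and so on) are hypothetical annotation tags chosen for this example, not the vocabulary of any particular database, and the rules simply restate the subclass definitions given in the text.

```python
def suggest_class(gene_families: set[str]) -> str:
    """Suggest a bacteriocin class from the accessory gene families in a cluster.

    The labels are hypothetical annotation tags used only for this sketch.
    """
    if {"LanB", "LanC"} <= gene_families:
        return "Class Ia, subclass I lantibiotic"
    if "LanM" in gene_families:
        return "Class Ia, subclass II lantibiotic"
    if "RamC-like" in gene_families:
        return "Class Ia, subclass III lantipeptide"
    if "LanL" in gene_families:
        return "Class Ia, subclass IV lantipeptide"
    if "radical SAM" in gene_families:
        return "Class Ic sactibiotic candidate"
    return "no modification enzyme found"


print(suggest_class({"LanM", "ABC transporter", "immunity protein"}))
# Class Ia, subclass II lantibiotic
```

A result of "no modification enzyme found" does not rule out a bacteriocin cluster, since Class II peptides are unmodified.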

Application of Bacteriocins

Bacteriocins have proved useful as antimicrobial compounds in the food and health industries. In the food industry, bacteriocins such as nisin and pediocin PA-1 can improve food safety and food quality. Bacteriocins produced by lactic acid bacteria (LAB) are of particular interest to the food industry since LAB have been awarded GRAS status (Generally Recognized As Safe) and can therefore be used in food preparations (Cotter et al. 2005). More recently, the contribution of bacteriocin production to the efficacy of certain probiotics has been recognized, suggesting another route via which bacteriocins can be of value within the food for health arena (Dobson et al. 2012). In the health industry, the use of bacteriocins as an alternative to antibiotics has long been mooted (Piper et al. 2009). The potential benefits of employing bacteriocins in this arena have become particularly apparent in recent times as a consequence of an appreciation of the “collateral damage” which antibiotics can inflict on the commensal microbiota. Narrow-spectrum bacteriocins may well address this issue in view of their target specificity. In the area of veterinary medicine, bacteriocins have proven useful in the control of mastitis in cattle and as an additive to animal feed with a view to improving general animal health (Abriouel et al. 2011). It has also been suggested that bacteriocins or bacteriocin-producing microbes could be employed as biocontrol agents which, for example, could be added to soil to control plant pathogens (Abriouel et al. 2011).

Identification of Novel Bacteriocin Gene Clusters

Traditionally, the identification of novel bacteriocin gene clusters has involved using classical microbiology to screen large collections of strains, using a culture-based assessment of their ability to produce novel antimicrobials (Fig. 2). This is then followed by the identification of the responsible genes through subcloning, mutagenesis, reverse genetics, or, more recently, sequencing of the corresponding genome. However, in spite of constant improvements in culturing techniques, it is still estimated that just 10–50% of bacteria are culturable. Fortunately, metagenomic DNA sequencing provides an alternative with respect to identifying novel bacteriocin gene clusters by facilitating an unbiased characterization of entire microbial communities. In particular, recent improvements in sequencing technologies have resulted in a massive increase in sequence data, leading to the development of valuable public databases and annotation pipelines (http://camera.calit2.net/, http://img.jgi.doe.gov/, http://metagenomics.anl.gov/). The generation of vast quantities of DNA sequence data from metagenomics-based projects from varying environments across the globe represents a considerable resource from which new bacteriocin gene clusters can be identified. There are a number of ways in which this information can be harnessed. One example is BACTIBASE, a bacteriocin database and suite of analysis tools established to archive known bacteriocin sequences and enhance the discovery of bacteriocins in genomic data (Hammami et al. 2010). The current release of BACTIBASE contains 177 bacteriocin sequences against which one can test the homology of a query bacteriocin sequence, perform sequence alignments, and predict peptide structure (Hammami et al. 2010). Searches are limited to the known sequences already in the database, and the usefulness of the tool is also affected by the fact that bacteriocin peptides themselves often share little or no homology.
A specific bacteriocin mining tool, BAGEL2 (BActeriocin GEnome Location), was established to search for novel bacteriocin sequences in genomic data (de Jong et al. 2010). BAGEL2 has a built-in database of bacteriocin and bacteriocin-related sequences and, in addition to genes encoding the structural bacteriocin peptide, uses genes involved in bacteriocin biosynthesis, regulation, export, and immunity to reveal related genes in novel clusters. Additionally, searches can be implemented against finished genome sequences or against novel genomes uploaded by the user. The fact that genes involved in the modification of Class I bacteriocins, such as those generically named lanM, lanB, and lanC or those encoding radical SAMs associated with sactibiotic production, are frequently more highly conserved than the structural genes themselves has also been utilized in recent years to identify Class I gene clusters in genomic and metagenomic databases. During this period, targeted searches for bacteriocins in genomic data have resulted in the discovery of several novel active bacteriocins, such as lichenicidin (Begley et al. 2009) and a Streptococcus-associated lantibiotic (Majchrzykiewicz et al. 2010), among others. This strategy parallels similar genome-based approaches which have identified gene clusters encoding other ribosomally synthesized natural products (Velásquez and van der Donk 2011). In addition to the identification of novel bacteriocins, the screening of genomes using the LtnM1 protein of lacticin 3147 (Begley et al. 2009; O’Sullivan et al. 2011) or the radical SAM proteins of thuricin CD (Murphy et al. 2011) as drivers has also revealed several potential bacteriocin-encoding clusters. It is anticipated that many of these will be the focus of further investigation in the coming years.
Fig. 2 Representative agar plate depicting the outcome of a culture-based screen for bacteriocin activity
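
The driver-sequence strategy described above can be sketched as a neighborhood scan: once a conserved accessory gene has been located, the genes around it are inspected, and short open reading frames are flagged as possible structural peptides. This is an illustrative sketch only and is not how BAGEL2 is implemented; the `Gene` record, the 5 kb window, the 100-residue cutoff, and the example contig are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    gene_id: str
    start: int    # 1-based, inclusive
    end: int      # 1-based, inclusive
    product: str


def neighbourhood(genes, driver_id, window_bp=5_000):
    """Genes that overlap a window of window_bp on either side of the driver."""
    driver = next(g for g in genes if g.gene_id == driver_id)
    lo, hi = driver.start - window_bp, driver.end + window_bp
    return [g for g in genes
            if g.gene_id != driver_id and g.end >= lo and g.start <= hi]


def small_orfs(genes, max_aa=100):
    """Genes short enough to encode a peptide (length in codons minus the stop)."""
    return [g for g in genes if (g.end - g.start + 1) // 3 - 1 <= max_aa]


# Hypothetical contig, invented for illustration only.
contig = [
    Gene("g1", 100, 1200, "transposase"),
    Gene("g2", 15000, 15200, "hypothetical protein"),
    Gene("g3", 15300, 18400, "LanM homolog"),
    Gene("g4", 18500, 20600, "ABC transporter"),
    Gene("g5", 40000, 41000, "ribosomal protein"),
]

nearby = neighbourhood(contig, "g3")
print([g.gene_id for g in nearby])              # ['g2', 'g4']
print([g.gene_id for g in small_orfs(nearby)])  # ['g2']
```

In practice the window and length cutoff would need tuning, and unannotated short reading frames would have to be predicted first, since the annotation may not contain them at all.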

Identification of Bacteriocins in Metagenomes

The identification of bacteriocins within metagenomic DNA can be performed via laboratory-based or in silico-based approaches. A recent example of the former involved a PCR-based screen to establish the bacteriocin-producing potential within metagenomic DNA sourced from 40 Polish cheeses (Więckowicz et al. 2011). In this case, PCR primers were designed to exploit conserved sequence motifs within the four anti-listerial bacteriocin peptides divercin V41, enterocin P, mesentericin Y105, and bacteriocin 423. It was established that metagenomic DNA from each one of the 40 cheeses yielded a PCR product, thereby highlighting the bacteriocin-producing potential of the cheese microbiota (Więckowicz et al. 2011). While laboratory-based screens have considerable potential, the vast information present in metagenomic DNA databases suggests that in silico screening for bacteriocin gene clusters can be a more successful approach.
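
Primers that target a conserved peptide motif are usually degenerate, because several codons can encode the same residue. The sketch below shows how such a primer can be checked against a template in silico by expanding the standard IUPAC nucleotide codes into a regular expression. The primer and template are hypothetical and are not the primers of Więckowicz et al. (2011); only the forward strand is searched, and matches are non-overlapping.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_to_regex(primer: str) -> re.Pattern:
    """Expand a degenerate primer into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def primer_sites(primer: str, template: str) -> list[int]:
    """0-based start positions of forward-strand matches in the template."""
    return [m.start() for m in primer_to_regex(primer).finditer(template.upper())]


# Hypothetical degenerate primer and template, invented for illustration only.
primer = "TAYGGNAAYGGN"
template = "ccATGAAATATGGTAACGGTGTTtaa"
print(primer_sites(primer, template))  # [8]
```

A fuller check would also search the reverse complement and allow a small number of mismatches, which this sketch does not attempt.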

Recently, two studies have carried out basic homology searches against metagenomes to identify clusters containing lanM genes and potentially encoding novel type II lantibiotics (O’Sullivan et al. 2011), or those possessing trnC-/trnD-like genes which potentially encode novel sactibiotics (Murphy et al. 2011). In both studies, a simple BLAST search against the CAMERA (http://camera.calit2.net) metagenomic databases was implemented and homologous proteins were identified. The lanM search revealed homologs in 11 metagenomes (Table 2). Three of these came from an Indian Ocean metagenome, four from hypersaline lagoon metagenomes from the Galapagos Islands, and one each from a coastal sea water metagenome from the Gulf of Mexico, a farm soil metagenome, a whale fall carcass rib bone metagenome, and a coral reef metagenome (Table 2). Further phylogenetic analysis with the 11 homologs and previously identified bacteriocin-like gene clusters revealed that the homologs from the metagenomes were related to other lanM genes from a wide variety of different microorganisms, thus highlighting the diverse nature of the metagenome-associated genes (O’Sullivan et al. 2011). The search that used trnC-/trnD-like genes as driver sequences yielded 365 TrnC homologs and 151 TrnD homologs in metagenomes from environments as diverse as Waseca soil, a coral reef, and the ocean surrounding the Galapagos Islands (Murphy et al. 2011), again highlighting the presence of bacteriocin-associated genes in metagenomic data.
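
A homology search of this kind typically ends with a table of hits that must be filtered before any phylogenetic analysis. The following sketch assumes the BLAST results were saved in the standard 12-column tabular format; the identity and E-value thresholds, the read identifiers, and the example rows are hypothetical and do not reproduce the data or parameters of the studies cited above.

```python
import csv
import io

# Column order of the default BLAST tabular output.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_hits(handle, min_identity=30.0, max_evalue=1e-5):
    """Return (subject, % identity, E-value) for hits passing both thresholds,
    sorted from the most to the least significant E-value."""
    kept = []
    for row in csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t"):
        if row["qseqid"].startswith("#"):   # skip comment lines
            continue
        identity, evalue = float(row["pident"]), float(row["evalue"])
        if identity >= min_identity and evalue <= max_evalue:
            kept.append((row["sseqid"], identity, evalue))
    return sorted(kept, key=lambda hit: hit[2])


# Hypothetical hits for an LtnM1 driver query, invented for illustration only.
example = io.StringIO(
    "LtnM1\tread_001\t41.2\t310\t170\t6\t12\t320\t1\t305\t2e-60\t210\n"
    "LtnM1\tread_002\t24.0\t120\t88\t3\t400\t520\t10\t128\t3e-04\t45.1\n"
    "LtnM1\tread_003\t35.5\t280\t165\t8\t30\t305\t5\t282\t1e-40\t160\n"
)
for subject, identity, evalue in filter_hits(example):
    print(subject, identity, evalue)
```

With these invented rows, read_002 is discarded because it falls below both thresholds, and the two remaining hits are listed in order of significance.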
Table 2 LanM homologs in metagenomic databases (from O’Sullivan et al. 2011)

Protein function             | Environment            | Location                     | % identity
Lantibiotic-modifying enzyme | Sea water              | Indian Ocean                 |
Hypothetical protein         | Soil sample            | Waseca County, USA           |
Lantibiotic-modifying enzyme | Whale fall rib carcass | Santa Cruz Basin, USA        |
Lantibiotic-modifying enzyme | Hypersaline lagoons    | Galapagos Islands            |
Hypothetical protein         | Coastal sea water      | Gulf of Mexico               |
Hypothetical protein         | Hypersaline lagoons    | Galapagos Islands            |
Hypothetical protein         | Open ocean             | Indian Ocean                 |
Hypothetical protein         | Coral reef             | Cook’s Bay, French Polynesia |
Hypothetical protein         | Open ocean             | Indian Ocean                 |
Mersacidin-modifying enzyme  | Open ocean             | Galapagos Islands            |
Hypothetical protein         | Hypersaline lagoons    | Galapagos Islands            |


Despite the valuable insights provided by these analyses, they failed to identify complete bacteriocin gene clusters. A more suitable analysis tool would allow a homology search with multiple genes (or even an operon) and therefore enhance the possibility of identifying a true bacteriocin cluster. Existing tools for metagenome analysis are in two formats: functional keyword search engines, such as those available through the MG-RAST (Glass et al. 2010) and IMG/M (Markowitz et al. 2008) platforms, and homology search engines, such as JCoast (Richter et al. 2008), MetaMine (Bohnebeck et al. 2008), and CAMERA (Sun et al. 2011). Functional searches rely heavily on searching among annotated genes. This is inherently reliant on accurate annotation, and due to the small size and heterogeneous nature of bacteriocin peptides, the corresponding genes are often overlooked or mis-annotated. Homology search tools such as CAMERA and JCoast are single gene search-driven, although JCoast does have a graphical user interface that allows visualization of the surrounding gene neighborhood, which would prove particularly useful for screening for the presence of other genes in the bacteriocin operons (Richter et al. 2008). MetaMine allows homology searches with “gene neighborhoods”; again, this would prove particularly useful for bacteriocin clusters. MetaMine searches are, however, restricted to marine metagenomic databases (Bohnebeck et al. 2008). It should also be noted that, as a consequence of the evolution of DNA sequencing technologies, longer stretches of contiguous metagenomic DNA will become available, which will further enhance our ability to identify complete bacteriocin gene clusters. Despite this, it must also be noted that the presence of bacteriocin homologs alone is not an indicator of function. Clearly, in silico analysis is not sufficient to determine the functional presence of a bacteriocin. However, the likelihood that even a proportion of bacteriocin homologs will be deemed functional is an intriguing prospect.
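
The multi-gene search suggested above can be approximated by post-processing single-gene results: hits to several driver genes are grouped by contig, and only contigs on which all drivers occur close together are retained. This is a minimal sketch of that idea and not the algorithm of any tool named in the text; the hit tuples, contig names, and the 20 kb span limit are assumptions made for illustration.

```python
from collections import defaultdict


def contigs_with_cluster(hits, required, max_span_bp=20_000):
    """Map contig id to the span (bp) covered by hits to the required drivers.

    hits: iterable of (driver_name, contig_id, start, end), 1-based inclusive.
    A contig is kept only if every required driver has a hit on it and all of
    those hits fit within max_span_bp. Multiple hits to one driver are not
    resolved individually, which keeps the sketch simple but conservative.
    """
    by_contig = defaultdict(list)
    for driver, contig_id, start, end in hits:
        if driver in required:
            by_contig[contig_id].append((driver, start, end))

    clusters = {}
    for contig_id, items in by_contig.items():
        if {driver for driver, _, _ in items} != set(required):
            continue
        span = max(e for _, _, e in items) - min(s for _, s, _ in items) + 1
        if span <= max_span_bp:
            clusters[contig_id] = span
    return clusters


# Hypothetical hits to two sactibiotic driver proteins, invented for illustration.
example_hits = [
    ("TrnC", "contigA", 1000, 2400),
    ("TrnD", "contigA", 2500, 3900),
    ("TrnC", "contigB", 500, 1900),
    ("TrnD", "contigC", 100, 1500),
    ("TrnC", "contigC", 90000, 91400),
]
print(contigs_with_cluster(example_hits, {"TrnC", "TrnD"}))
# {'contigA': 2901}
```

Short metagenomic reads rarely span more than one gene, so a filter of this kind is only informative on assembled contigs, which is consistent with the point above about longer contiguous sequences.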

Harnessing Bacteriocin Gene Clusters

While the in silico analysis of newly identified bacteriocin gene clusters within metagenomic DNA can be of great value from a fundamental perspective, the harnessing of the antimicrobial potential of these clusters will undoubtedly become a priority in the future. In the majority of instances, the specific strain from which the fragment of metagenomic DNA originated will not be available, or may not be culturable, and other strategies will be required. The genetics-based options available can be divided into in vivo and in vitro approaches. Regardless of the approach, specific genes within the cluster will need to be regenerated through DNA synthesis technology. In the case of in vivo harnessing, the DNA fragment(s) will be cloned and expressed heterologously, using approaches such as those employed to facilitate the production of a Streptococcus-associated lantibiotic by Lactococcus lactis (Majchrzykiewicz et al. 2010) and by Escherichia coli. Alternatively, when dealing with modified bacteriocins, one can clone and express individual genes heterologously and then purify the products to facilitate the in vitro reconstitution of biosynthesis using the corresponding modification proteins or related enzymes originating from other sources (Knerr and van der Donk 2012). Finally, an alternative non-genetics-based approach, which is available when gene clusters predicted to encode unmodified peptides are identified, is to employ peptide synthesis with a view to generating a synthetic equivalent of the natural antimicrobial. It is anticipated that these various options will be widely used in the years to come.


In order to effectively mine metagenomes for bacteriocins, accurate annotation of the datasets is essential. As the volume of data grows, it is anticipated that the precision of annotation tools will improve in tandem. The number of bacteriocin-associated gene homologs present in diverse metagenomic environments suggests the presence of multiple corresponding gene clusters. The further expansion of metagenomic DNA databases will undoubtedly increase our appreciation of just how widespread and diverse these clusters are. As the commercial application of bacteriocins becomes more common (for a review, see Cotter et al. 2005), we can anticipate that we will reap the benefits of in silico screening and harnessing of this untapped reservoir of novel bacteriocins.



  1. Abriouel H, Franz CMAP, Omar NB, Gálvez A. Diversity and applications of Bacillus bacteriocins. FEMS Microbiol Rev. 2011;35(1):201–32. doi:10.1111/j.1574-6976.2010.00244.x.
  2. Begley M, Cotter PD, Hill C, Ross RP. Identification of a novel two-peptide lantibiotic, lichenicidin, following rational genome mining for LanM proteins. Appl Environ Microbiol. 2009;75(17):5451–60. doi:10.1128/aem.00730-09.
  3. Bohnebeck U, Lombardot T, Kottmann R, Glockner FO. MetaMine – a tool to detect and analyse gene patterns in their environmental context. BMC Bioinforma. 2008;9:459. doi:10.1186/1471-2105-9-459.
  4. Cotter PD, Hill C, Ross RP. Bacteriocins: developing innate immunity for food. Nat Rev Micro. 2005;3(10):777–88.
  5. de Jong A, van Heel AJ, Kok J, Kuipers OP. BAGEL2: mining for bacteriocins in genomic data. Nucleic Acids Res. 2010;38 suppl 2:W647–51. doi:10.1093/nar/gkq365.
  6. Dobson A, Cotter PD, Ross RP, Hill C. Bacteriocin production: a probiotic trait? Appl Environ Microbiol. 2012;78(1):1–6. doi:10.1128/aem.05576-11.
  7. Glass EM, Wilkening J, Wilke A, Antonopoulos D, Meyer F. Using the metagenomics RAST server (MG-RAST) for analyzing shotgun metagenomes. Cold Spring Harb Protoc. 2010;2010(1):pdb.prot5368. doi:10.1101/pdb.prot5368.
  8. Hammami R, Zouhir A, Le Lay C, Ben Hamida J, Fliss I. BACTIBASE second release: a database and tool platform for bacteriocin characterization. BMC Microbiol. 2010;10:22. doi:10.1186/1471-2180-10-22.
  9. Klaenhammer TR. Bacteriocins of lactic acid bacteria. Biochimie. 1988;70(3):337–49. 0300-9084(88)90206-4.
  10. Knerr PJ, van der Donk WA. Discovery, biosynthesis, and engineering of lantipeptides. Annu Rev Biochem. 2012. doi:10.1146/annurev-biochem-060110-113521.
  11. Majchrzykiewicz JA, Lubelski J, Moll GN, Kuipers A, Bijlsma JJ, Kuipers OP, et al. Production of a class II two-component lantibiotic of Streptococcus pneumoniae using the class I nisin synthetic machinery and leader sequence. Antimicrob Agents Chemother. 2010;54(4):1498–505. doi:10.1128/AAC.00883-09.
  12. Markowitz VM, Ivanova NN, Szeto E, Palaniappan K, Chu K, Dalevi D, et al. IMG/M: a data management and analysis system for metagenomes. Nucleic Acids Res. 2008;36 suppl 1:D534–8. doi:10.1093/nar/gkm869.
  13. Murphy K, O’Sullivan O, Rea MC, Cotter PD, Ross RP, Hill C. Genome mining for radical SAM protein determinants reveals multiple sactibiotic-like gene clusters. PLoS One. 2011;6(7):e20852. doi:10.1371/journal.pone.0020852.
  14. O’Sullivan O, Begley M, Ross R, Cotter P, Hill C. Further identification of novel lantibiotic operons using LanM-based genome mining. Probiotics Antimicrob Protein. 2011;3(1):27–40. doi:10.1007/s12602-011-9062-y.
  15. Piper C, Cotter PD, Ross RP, Hill C. Discovery of medically significant lantibiotics. Curr Drug Discov Technol. 2009;6(1):1–18.
  16. Rea M, Cotter P, Hill C, Ross R. Classification of bacteriocins from gram-positive bacteria. In: Drider D, Rebuffat S, editors. Prokaryotic antimicrobial peptides - from genes to applications. New York: Springer; 2011. p. 29.
  17. Richter M, Lombardot T, Kostadinov I, Kottmann R, Duhaime MB, Peplies J, et al. JCoast - a biologist-centric software tool for data mining and comparison of prokaryotic (meta)genomes. BMC Bioinforma. 2008;9:177. doi:10.1186/1471-2105-9-177.
  18. Sun S, Chen J, Li W, Altintas I, Lin A, Peltier S, et al. Community cyberinfrastructure for advanced microbial ecology research and analysis: the CAMERA resource. Nucleic Acids Res. 2011;39(Database issue):D546–51. doi:10.1093/nar/gkq1102.
  19. Velásquez JE, van der Donk WA. Genome mining for ribosomally synthesized natural products. Curr Opin Chem Biol. 2011;15(1):11–21.
  20. Więckowicz M, Schmidt M, Sip A, Grajek W. Development of a PCR-based assay for rapid detection of class IIa bacteriocin genes. Lett Appl Microbiol. 2011;52(3):281–9. doi:10.1111/j.1472-765X.2010.02999.x.

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Orla O’Sullivan (1, 2)
  • Colin Hill (2, 3)
  • Paul Ross (1, 2)
  • Paul Cotter (1, 2)

  1. Teagasc Food Research Centre, Moorepark, Fermoy, Co. Cork, Ireland
  2. Alimentary Pharmabiotic Centre, University College, Cork, Ireland
  3. Department of Microbiology, University College, Cork, Ireland