Abstract
A combination of gene loss and gene acquisition through horizontal gene transfer (HGT) is thought to drive the adaptation of Streptococcus thermophilus to its niche, i.e. milk. In this study, we describe an in silico analysis that combines a stochastic data mining method, an analysis of homologous gene distribution and the identification of features frequently associated with horizontally transferred genes, in order to assess the proportion of the S. thermophilus genome that could originate from HGT. Our mining approach indicated that about 17.7% of S. thermophilus genes (362 CDSs of 1,915) showed a composition bias; these genes were called ‘atypical’. For 22% of them, the functional annotation strongly supported acquisition through HGT; these consisted mainly of genes encoding mobile genetic recombinases, exopolysaccharide (EPS) biosynthesis enzymes or resistance mechanisms to bacteriophages. The distribution of the atypical genes in the Firmicutes phylum as well as within the S. thermophilus species was sporadic and supported the HGT prediction for more than half of them (52%, 189). Among them, 46 were found to be specific to S. thermophilus. Finally, combining our method, gene annotation and sequence-specific features suggested new genomic islands in the S. thermophilus genome.
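To make the notion of a "composition bias" concrete, the following is a minimal sketch of one simple way to flag compositionally atypical coding sequences. It is not the stochastic data mining method used in this study: as a stand-in, it scores each gene by the Kullback-Leibler divergence between its dinucleotide frequencies and those of all genes pooled together. The k-mer size, the pseudocount, the threshold and the toy gene names and sequences are all illustrative assumptions.

```python
from collections import Counter
from itertools import product
from math import log

K = 2  # k-mer size; an illustrative choice, not taken from the study
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def kmer_counts(seq):
    """Count overlapping k-mers in one sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + K] for i in range(len(seq) - K + 1))


def to_freqs(counts, pseudocount=1.0):
    """Turn counts into frequencies; the pseudocount avoids zeros."""
    total = sum(counts[k] for k in KMERS) + pseudocount * len(KMERS)
    return {k: (counts[k] + pseudocount) / total for k in KMERS}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) over the k-mer alphabet."""
    return sum(p[k] * log(p[k] / q[k]) for k in KMERS)


def flag_atypical(genes, threshold):
    """Return {gene: score} for genes whose composition diverges
    from the pooled background by more than `threshold`."""
    pooled = Counter()
    for seq in genes.values():
        pooled.update(kmer_counts(seq))
    background = to_freqs(pooled)
    scores = {
        name: kl_divergence(to_freqs(kmer_counts(seq)), background)
        for name, seq in genes.items()
    }
    return {name: s for name, s in scores.items() if s > threshold}


# Toy data (hypothetical): three AT-rich genes and one GC-rich gene.
genes = {
    "geneA": "AT" * 50,
    "geneB": "AT" * 50,
    "geneC": "TA" * 50,
    "geneD": "GC" * 50,
}
flagged = flag_atypical(genes, threshold=0.5)  # threshold is arbitrary
print(sorted(flagged))
```

In this toy example only the GC-rich gene stands out against the AT-rich background. A score of this kind captures only the composition signal; as the abstract notes, support for HGT also relied on functional annotation and on the distribution of homologous genes.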
Additional information
Communicated by Erko Stackebrandt.
Catherine Eng and Annabelle Thibessard contributed equally to this work.
Cite this article
Eng, C., Thibessard, A., Danielsen, M. et al. In silico prediction of horizontal gene transfer in Streptococcus thermophilus. Arch Microbiol 193, 287–297 (2011). https://doi.org/10.1007/s00203-010-0671-8
DOI: https://doi.org/10.1007/s00203-010-0671-8