c-GAMMA: Comparative Genome Analysis of Molecular Markers

  • Pierre Peterlongo
  • Jacques Nicolas
  • Dominique Lavenier
  • Raoul Vorc’h
  • Joël Querellou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5780)

Abstract

Discovery of molecular markers for efficient identification of living organisms remains a challenge of high interest. The diversity of species can now be observed in detail with the low-cost genomic sequences produced by the new generation of sequencers. A method, called c-GAMMA, is proposed that formalizes the design of new markers for such data. It is based on a series of filters on forbidden pairs of words, followed by an optimization step on the discriminative power of candidate markers.

First results are presented on a set of microbial genomes. The importance of further developments is stressed, to face the huge amounts of data that will soon become available in all kingdoms of life.

Keywords

Prime Pair, Molecular Marker, Comparative Genome Analysis, Yersinia Pestis, Ureaplasma Urealyticum
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Pierre Peterlongo (1)
  • Jacques Nicolas (1)
  • Dominique Lavenier (2)
  • Raoul Vorc’h (1)
  • Joël Querellou (3)

  1. Équipe-projet INRIA Symbiose, Rennes, France
  2. ENS Cachan - IRISA, France
  3. LM2E UMR6197 Ifremer, Centre de Brest, France
