Formal Languages Arising from Gene Repeated Duplication

  • Peter Leupold
  • Victor Mitrana
  • José M. Sempere
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2950)


We consider two types of languages defined by a string through iterative factor duplications, inspired by the process of tandem repeats production in the evolution of DNA. We investigate some decidability matters concerning the unbounded duplication languages and then fix the place of bounded duplication languages in the Chomsky hierarchy by showing that all these languages are context-free. We give some conditions for the non-regularity of these languages. Finally, we discuss some open problems and directions for further research.
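The duplication operation behind these languages can be illustrated with a small sketch. It assumes, from the description above, that one step rewrites a string x u y into x u u y for a non-empty factor u, and that the bounded variant limits the length of u. The names `duplicate_once`, `duplication_language`, `max_len`, and `max_word_len` are hypothetical and do not come from the paper. Because the full language is infinite, the sketch only enumerates words up to a chosen length.

```python
from collections import deque


def duplicate_once(w, max_len=None):
    """Return all strings x+u+u+y with w == x+u+y and u non-empty.

    If max_len is given, only factors u with len(u) <= max_len are
    duplicated (the assumed "bounded" variant).
    """
    results = set()
    n = len(w)
    for i in range(n):
        longest = n - i if max_len is None else min(max_len, n - i)
        for k in range(1, longest + 1):
            u = w[i:i + k]
            # Insert a second copy of u directly after the first one.
            results.add(w[:i + k] + u + w[i + k:])
    return results


def duplication_language(w, max_word_len, max_len=None):
    """Enumerate words reachable from w by iterated duplication.

    Only words of length <= max_word_len are kept. Each step strictly
    lengthens the word, so this cutoff loses no shorter words.
    """
    seen = {w}
    queue = deque([w])
    while queue:
        current = queue.popleft()
        for nxt in duplicate_once(current, max_len):
            if len(nxt) <= max_word_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    # Unbounded duplication of "ab", truncated at length 4.
    print(sorted(duplication_language("ab", 4)))
    # Bounded duplication: only single letters may be duplicated.
    print(sorted(duplication_language("ab", 4, max_len=1)))
```

A bounded membership check for a candidate word `v` then amounts to `v in duplication_language(w, len(v), max_len)`. This brute-force search is only meant to make the definitions concrete and says nothing about the decision procedures studied in the paper.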


Keywords: Formal Language, Regular Language, Mathematical Linguistics, Membership Problem, Input Tape





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Peter Leupold (1)
  • Victor Mitrana (2, 1)
  • José M. Sempere (3)

  1. Research Group on Mathematical Linguistics, Rovira i Virgili University, Tarragona, Spain
  2. Faculty of Mathematics and Computer Science, Bucharest University, Bucureşti, Romania
  3. Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, Valencia, Spain
