A Compression Algorithm as a Complexity Measure on DNA Sequences

  • Giulia Menconi
Chapter

Abstract

A new compression method has been used to show the existence of long-range correlated repetitive sequences in some complete genomes from the three domains of life. We define the computable complexity of a sequence. The resulting complexity analysis made it possible both to distinguish the functional regions of the genome and to identify the regions of lowest complexity, which correspond to noncoding regions.
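The idea of measuring complexity by compression can be illustrated with a minimal sketch. The chapter's own compression algorithm is not described on this page, so the sketch below substitutes the general-purpose zlib compressor as a stand-in; the function names `compression_complexity` and `windowed_complexity`, the window sizes, and the synthetic test sequence are all assumptions made for illustration only. The complexity of a sequence is approximated as compressed bits per symbol, and a sliding window gives a rough profile in which repetitive regions should score lower.

```python
import random
import zlib


def compression_complexity(seq: str) -> float:
    """Approximate complexity as compressed size in bits per symbol.

    zlib is only a stand-in here, not the chapter's compression method.
    """
    if not seq:
        raise ValueError("sequence must be non-empty")
    data = seq.encode("ascii")
    return 8 * len(zlib.compress(data, 9)) / len(data)


def windowed_complexity(seq: str, window: int, step: int):
    """Return (start, complexity) pairs for windows sliding along seq."""
    return [
        (start, compression_complexity(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Synthetic example (hypothetical data): a random stretch followed by
# a highly repetitive stretch over the DNA alphabet.
rng = random.Random(0)
random_part = "".join(rng.choice("ACGT") for _ in range(2000))
repeat_part = "ACGT" * 500

profile = windowed_complexity(random_part + repeat_part, window=1000, step=1000)
for start, value in profile:
    print(f"window at {start}: {value:.2f} bits/symbol")
```

In this toy profile the windows covering the repetitive stretch compress far better than those covering the random stretch, which is the kind of contrast a complexity analysis relies on to flag low-complexity regions.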

Keywords

Information Content, Complete Genome, Repetitive Sequence, Complexity Measure, Noncoding Region

Copyright information

© Springer Science+Business Media New York 2003

Authors and Affiliations

  • Giulia Menconi
    • 1
  1. Centro Interdisciplinare per lo Studio dei Sistemi Complessi, Università di Pisa, Pisa, Italy
