Abstract
We present evidence that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range: base pairs separated by thousands of base pairs are correlated. We find no such long-range correlation in the coding regions of the gene. We address the “non-stationarity” of the base-pair sequence by systematically applying many methods, including a new algorithm called Detrended Fluctuation Analysis (DFA). By systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences in the entire GenBank database with more than 512 base pairs (33 301 coding and 29 453 noncoding sequences), we refute the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA. We describe a simple model, based on a generalization of the classic Lévy walk, that accounts for the presence of long-range power-law correlations. Finally, we adapt to DNA the Zipf approach to analyzing linguistic texts and the Shannon approach to quantifying the “redundancy” of a linguistic text in terms of a measurable entropy function. We systematically compare coding and noncoding regions and find differences whose possible significance is a topic of current study.
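The abstract names DFA without describing its steps, so the following is only a minimal sketch of the commonly used box-wise form of the method, and it may differ in detail from the formulation used in the chapter. The sequence is mapped to a one-dimensional walk, the walk is cut into boxes of length l, a least-squares line is removed from each box, and the root-mean-square residual F(l) is measured. The slope of log F(l) against log l estimates the exponent alpha, with alpha near 0.5 expected for an uncorrelated sequence and larger values indicating long-range correlation. The purine/pyrimidine step rule, the function names, the box sizes, and the synthetic random sequence are all assumptions made for illustration.

```python
import math
import random


def dna_walk(sequence):
    """Cumulative sum of +1 (pyrimidine C/T) and -1 (purine A/G) steps.

    The step rule is an assumption for this sketch; other mappings exist.
    Any character other than A, C, G, T raises KeyError.
    """
    steps = {"C": 1, "T": 1, "A": -1, "G": -1}
    walk, position = [], 0
    for base in sequence.upper():
        position += steps[base]
        walk.append(position)
    return walk


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_fluctuation(walk, box):
    """RMS residual after removing a linear trend from each full box."""
    xs = list(range(box))
    total, count = 0.0, 0
    for start in range(0, len(walk) - box + 1, box):
        segment = walk[start:start + box]
        slope, intercept = linear_fit(xs, segment)
        total += sum((y - (slope * x + intercept)) ** 2
                     for x, y in zip(xs, segment))
        count += box
    return math.sqrt(total / count)


def dfa_exponent(walk, boxes):
    """Slope of log F(l) versus log l over the given box sizes."""
    log_l = [math.log(b) for b in boxes]
    log_f = [math.log(dfa_fluctuation(walk, b)) for b in boxes]
    slope, _ = linear_fit(log_l, log_f)
    return slope


# Synthetic control: independent random bases, not real GenBank data.
random.seed(0)
shuffled = "".join(random.choice("ACGT") for _ in range(20000))
alpha = dfa_exponent(dna_walk(shuffled), [8, 16, 32, 64, 128, 256])
print(f"alpha for an uncorrelated sequence: {alpha:.2f}")
```

Running the same functions on a real sequence only requires replacing `shuffled` with the sequence string. Removing the trend box by box is what makes the estimate less sensitive to the non-stationarity mentioned in the abstract.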
References
A. Bunde, S. Havlin, eds.: Fractals and Disordered Systems (Springer-Verlag, Berlin 1991);
A. Bunde, S. Havlin, eds.: Fractals in Science (Springer-Verlag, Berlin 1994)
T. Vicsek, M. Shlesinger, M. Matsushita, eds.: Fractals in Natural Sciences (World Scientific, Singapore, 1994)
J.M. Garcia-Ruiz, E. Louis, P. Meakin, L. Sander, eds.: Growth Patterns in Physical Sciences and Biology [Proc. 1991 NATO Advanced Research Workshop, Granada, Spain, October 1991], (Plenum, New York, 1993)
A.Yu. Grosberg, A.R. Khokhlov: Statistical Physics of Macromolecules, translated by Y. A. Atanov (AIP Press, New York, 1994)
J.B. Bassingthwaighte, L.S. Liebovitch, B.J. West: Fractal Physiology (Oxford University Press, New York, 1994)
A.-L. Barabási, H.E. Stanley: Fractal Concepts in Surface Growth (Cambridge University Press, Cambridge, 1995)
B.J. West, A.L. Goldberger: J. Appl. Physiol., 60, 189 (1986);
B.J. West, A.L. Goldberger: Am. Sci., 75, 354 (1987);
A.L. Goldberger, B.J. West: Yale J. Biol. Med. 60, 421 (1987);
A.L. Goldberger, D.R. Rigney, B.J. West: Sci. Am. 262, 42 (1990);
B.J. West, M.F. Shlesinger: Am. Sci. 78, 40 (1990);
B.J. West: Fractal Physiology and Chaos in Medicine (World Scientific, Singapore 1990);
B.J. West, W. Deering: Phys. Reports 246, 1 (1994);
S.V. Buldyrev, A.L. Goldberger, S. Havlin, C.-K. Peng, H.E. Stanley: in Fractals in Science, edited by A. Bunde and S. Havlin (Springer-Verlag, Berlin, 1994), pp. 49–83
T. Vicsek: Fractal Growth Phenomena, Second Edition (World Scientific, Singapore 1992)
J. Feder: Fractals (Plenum, New York 1988)
D. Stauffer, H.E. Stanley: From Newton to Mandelbrot: A Primer in Theoretical Physics, Second Edition (Springer-Verlag, Heidelberg & New York 1995)
E. Guyon, H.E. Stanley: Les Formes Fractales (Palais de la Découverte, Paris 1991);
English translation: Fractal Forms (Elsevier North Holland, Amsterdam 1991)
H.E. Stanley, N. Ostrowsky, eds.: Random Fluctuations and Pattern Growth: Experiments and Models, Proceedings 1988 Cargèse NATO ASI (Kluwer Academic Publishers, Dordrecht, 1988)
H.E. Stanley: Introduction to Phase Transitions and Critical Phenomena (Oxford University Press, London 1971)
H.E. Stanley, N. Ostrowsky, eds.: Correlations and Connectivity: Geometric Aspects of Physics, Chemistry and Biology, Proceedings 1990 Cargèse NATO ASI, Series E: Applied Sciences (Kluwer, Dordrecht 1990)
C.-K. Peng, S.V. Buldyrev, A.L. Goldberger, S. Havlin, F. Sciortino, M. Simons, H.E. Stanley: Nature 356, 168 (1992)
W. Li, K. Kaneko: Europhys. Lett. 17, 655 (1992)
S. Nee: Nature 357, 450 (1992)
R. Voss: Phys. Rev. Lett. 68, 3805 (1992);
R. Voss: Fractals 2, 1 (1994)
J. Maddox: Nature 358, 103 (1992)
P.J. Munson, R.C. Taylor, G.S. Michaels: Nature 360, 636 (1992)
I. Amato: Science 257, 747 (1992)
V.V. Prabhu, J.-M. Claverie: Nature 357, 782 (1992)
P. Yam: Sci. Am. 267[3], 23 (1992)
C.-K. Peng, S.V. Buldyrev, A.L. Goldberger, S. Havlin, F. Sciortino, M. Simons, H.E. Stanley: Physica A 191, 25 (1992);
H.E. Stanley, S.V. Buldyrev, A.L. Goldberger, J.M. Hausdorff, S. Havlin, J. Mietus, C.-K. Peng, F. Sciortino, M. Simons: Physica A 191, 1 (1992)
H. E. Stanley, S. V. Buldyrev, A. L. Goldberger, S. Havlin, R. N. Mantegna, S. M. Ossadnik, C.-K. Peng, F. Sciortino and M. Simons, “Fractals in Biology and Medicine,” in Diffusion Processes: Experiment, Theory, Simulations [Proceedings of the Vth M. Born Symposium], edited by A. Pekalski (Springer-Verlag, Berlin, 1994), pp. 147–178;
H. E. Stanley, S. V. Buldyrev, A. L. Goldberger, Z. D. Goldberger, S. Havlin, R. N. Mantegna, S. M. Ossadnik, C.-K. Peng and M. Simons, “Statistical Mechanics in Biology: How Ubiquitous are Long-Range Correlations?” Proc. International Conference on Statistical Mechanics, Physica A 205, 214 (1994).
C.A. Chatzidimitriou-Dreismann, D. Larhammar: Nature 361, 212 (1993);
D. Larhammar, C.A. Chatzidimitriou-Dreismann: Nucleic Acids Res. 21, 5167 (1993)
C.A. Chatzidimitriou-Dreismann, R.M.F. Streffer, D. Larhammar: Biochim. Biophys. Acta 1217, 181 (1994)
C.A. Chatzidimitriou-Dreismann, R.M.F. Streffer, D. Larhammar: Eur. J. Biochem. 224, 365 (1994)
A.Yu. Grosberg, Y. Rabin, S. Havlin, A. Neer: Europhys. Lett. 23, 373 (1993)
S. Karlin, V. Brendel: Science 259, 677 (1993)
C.-K. Peng, S.V. Buldyrev, A.L. Goldberger, S. Havlin, M. Simons, H.E. Stanley: Phys. Rev. E 47, 3730 (1993)
N. Shnerb, E. Eisenberg: Phys. Rev. E 49, R1005 (1994)
S.V. Buldyrev, A.L. Goldberger, S. Havlin, C.-K. Peng, M. Simons, H.E. Stanley: Phys. Rev. E 47, 4514 (1993)
A. S. Borovik, A. Yu. Grosberg and M. D. Frank-Kamenetski, J. Biomolec. Structure and Dynamics 12, 655 (1994);
V. Pande, A. Yu. Grosberg and T. Tanaka, P.N.A.S. 91, 12972 (1994)
S.V. Buldyrev, A.L. Goldberger, S. Havlin, C.-K. Peng, H.E. Stanley, M.H.R. Stanley, M. Simons: Biophys. J. 65, 2673 (1993)
C.-K. Peng, S.V. Buldyrev, S. Havlin, M. Simons, H.E. Stanley, A.L. Goldberger: Phys. Rev. E 49, 1685 (1994)
S.M. Ossadnik, S.V. Buldyrev, A.L. Goldberger, S. Havlin, R.N. Mantegna, C-K. Peng, M. Simons, H.E. Stanley: Biophys. J. 67, 64 (1994);
H.E. Stanley, S.V. Buldyrev, A.L. Goldberger, S. Havlin, C.-K. Peng, M. Simons: [Proceedings of Internat’l Conf. on Condensed Matter Physics, Bar-Ilan], Physica A 200, 4 (1993);
H.E. Stanley, S.V. Buldyrev, A.L. Goldberger, S. Havlin, S.M. Ossadnik, C.-K. Peng, M. Simons: Fractals 1, 283–301 (1993);
S. Havlin, S. V. Buldyrev, A. L. Goldberger, R. N. Mantegna, S. M. Ossadnik, C.-K. Peng, M. Simons, and H. E. Stanley, Chaos, Solitons, and Fractals 6, 171 (1995).
S.V. Buldyrev, A.L. Goldberger, S. Havlin, R.N. Mantegna, M.E. Matsa, C.-K. Peng, M. Simons, and H.E. Stanley, “Long-Range Correlation Properties of Coding and Noncoding DNA Sequences,” Phys. Rev. E 51, 5084 (1995)
R.N. Mantegna, S.V. Buldyrev, A.L. Goldberger, S. Havlin, C.-K. Peng, M. Simons, H.E. Stanley: Phys. Rev. Lett. 73, 3169 (1994);
F. Flam: Science 266, 1320 (1994);
E. Pennisi: Science News 146, 391 (1994);
P. Yam: Scientific American 272 [3], 24 (1995)
S. Tavaré, B.W. Giddings, in: Mathematical Methods for DNA Sequences, ed. M.S. Waterman (CRC Press, Boca Raton 1989), pp. 117–132;
J.D. Watson, M. Gilman, J. Witkowski, M. Zoller: Recombinant DNA (Scientific American Books, New York 1992).
E.W. Montroll, M.F. Shlesinger: “The Wonderful World of Random Walks” in: Nonequilibrium Phenomena II. From Stochastics to Hydrodynamics, ed. by J.L. Lebowitz, E.W. Montroll (North-Holland, Amsterdam 1984), pp. 1–121
G.H. Weiss: Random Walks (North-Holland, Amsterdam 1994)
S. Havlin, R. Selinger, M. Schwartz, H.E. Stanley, A. Bunde: Phys. Rev. Lett. 61, 1438 (1988);
S. Havlin, M. Schwartz, R. Blumberg Selinger, A. Bunde, H.E. Stanley: Phys. Rev. A 40, 1717 (1989);
R.B. Selinger, S. Havlin, F. Leyvraz, M. Schwartz, H.E. Stanley: Phys. Rev. A 40, 6755 (1989)
C.-K. Peng, S. Havlin, M. Schwartz, H.E. Stanley, G.H. Weiss: Physica A 178, 401 (1991);
C.-K. Peng, S. Havlin, M. Schwartz, H.E. Stanley: Phys. Rev. A 44, 2239 (1991)
M. Araujo, S. Havlin, G.H. Weiss, H.E. Stanley: Phys. Rev. A 43, 5207 (1991);
S. Havlin, S.V. Buldyrev, H.E. Stanley, G.H. Weiss: J. Phys. A 24, L925 (1991);
S. Prakash, S. Havlin, M. Schwartz, H.E. Stanley: Phys. Rev. A 46, R1724 (1992)
M. Y. Azbel: Phys. Rev. Lett. 31, 589 (1973).
C.L. Berthelsen, J.A. Glazier, M.H. Skolnick: Phys. Rev. A 45, 8902 (1992)
M. Y. Azbel: Biopolymers 21, 1687 (1982).
A. Arneodo, E. Bacry, P.V. Graves, J.F. Mugy: Phys. Rev. Lett. 74, 3293 (1995)
E.C. Uberbacher, R.J. Mural: Proc. Natl. Acad. Sci. USA 88, 11261 (1991)
J. Jurka, T. Walichiewicz, A. Milosavljevic: J. Mol. Evol. 35, 286 (1992)
M.F. Shlesinger, J. Klafter: in On Growth and Form: Fractal and Non-Fractal Patterns in Physics, edited by H.E. Stanley and N. Ostrowsky (Martinus Nijhoff, Dordrecht, 1986), p. 279ff
M.F. Shlesinger, J. Klafter, Y.M. Wong: J. Stat. Phys. 27, 499 (1982)
M.F. Shlesinger, J. Klafter: Phys. Rev. Lett. 54, 2551 (1985)
R.N. Mantegna: Physica A 179, 232 (1991)
J. Jurka: J. Mol. Evol. 29, 496 (1989)
R.H. Hwu, J.W. Roberts, E.H. Davidson, R.J. Britten: Proc. Natl. Acad. Sci. USA. 83, 3875 (1986)
E. Zuckerkandl, G. Latter, J. Jurka: J. Mol. Evol. 29, 504 (1989)
B. Lewin: Genes IV (Oxford University Press, Oxford, 1990)
P.-G. de Gennes: Scaling Concepts in Polymer Physics (Cornell University Press, Ithaca NY, 1979)
J. des Cloizeaux: J. Physique (Paris) 41, 223 (1980)
S. Redner: J. Phys. A 13, 3525 (1980)
A. Baumgartner: Z. Phys. B 42, 265 (1981)
T. M. Birshtein, S. V. Buldyrev: Polymer 32, 3387 (1991)
A. Schenkel, J. Zhang, Y-C. Zhang: Fractals 1, 47 (1993);
M. Amit, Y. Shmerler, E. Eisenberg, M. Abraham, N. Shnerb: Fractals 2, 7 (1994);
W. Ebeling, A. Neiman: Physica A 215, 233 (1995)
G.K. Zipf: Human Behavior and the Principle of “Least Effort” (Addison-Wesley, New York 1949)
L. Brillouin: Science and Information Theory (Academic Press, New York 1956)
C.E. Shannon: Bell Systems Tech. J. 30, 50 (1951);
H. Herzel, A. O. Schmitt, and W. Ebeling, Chaos, Solit. and Fractals 4, 97 (1994);
H. Herzel, W. Ebeling, and A. O. Schmitt, Phys. Rev. E 50, 5061 (1994);
H. Herzel and I. Große, Physica A xx, xxx (1995).
R.N. Mantegna, S.V. Buldyrev, A.L. Goldberger, S. Havlin, C.-K. Peng, M. Simons, H.E. Stanley: Phys. Rev. E (submitted);
S.V. Buldyrev, A.L. Goldberger, S. Havlin, R. N. Mantegna, C.-K. Peng, M. Simons, H.E. Stanley: Phys. Rev. E (submitted).
A. Czirók, R. N. Mantegna, S. Havlin and H. E. Stanley, “Correlations in Binary Sequences and Generalized Zipf Analysis,” Phys. Rev. E 52, xxx (1 July 1995).
S. Havlin: “Distance between Zipf plots,” Physica A (in press)
J.-P. Bouchaud: “More Levy Distributions in Physics,” in Proc. 1993 International Conf. on Levy Flights, edited by M. F. Shlesinger, G. Zaslavsky, and U. Frisch (Springer, Berlin, 1995).
M.H.R. Stanley: 1994 Westinghouse Report (unpublished);
H. E. Stanley, S. V. Buldyrev, A. L. Goldberger, S. Havlin, R. N. Mantegna, C.-K. Peng, M. Simons, and M. H. R. Stanley, “Long-Range Correlations and Generalized Levy Walks in DNA Sequences,” in Proc. 1993 International Conf. on Levy Flights, edited by M. F. Shlesinger, G. Zaslavsky, and U. Frisch (Springer, Berlin 1995).
M.H.R. Stanley, S.V. Buldyrev, S. Havlin, R. Mantegna, M.A. Salinger, H.E. Stanley: “Zipf plots and the size distribution of Firms,” Economics Lett. 49, xxx (October 1995).
See also R. N. Mantegna and H. E. Stanley, “Ultra-Slow Convergence to a Gaussian: The Truncated Lévy Flight,” in Proc. 1993 International Conf. on Lévy Flights, edited by M. F. Shlesinger, G. Zaslavsky, and U. Frisch (Springer, Berlin 1995);
R. N. Mantegna and H. E. Stanley, “Scaling and Intermittency in the Mesoscopic Dynamics of an Economic Index,” Nature 375, xxx (July 1995).
C.-K. Peng, J. Mietus, J. Hausdorff, S. Havlin, H. E. Stanley, and A. L. Goldberger, Phys. Rev. Lett. 70, 1343 (1993);
C. K. Peng, S. V. Buldyrev, J. M. Hausdorff, S. Havlin, J. E. Mietus, M. Simons, H. E. Stanley, and A. L. Goldberger, in Fractals in Biology and Medicine, G. A. Losa, T. F. Nonnenmacher and E. R. Weibel, eds. (Birkhauser Verlag, Boston, 1994).
See also G. M. Viswanathan, S. V. Buldyrev, H. E. Stanley, V. Afanasyev, and P. A. Prince, “Temporal Scale-Invariance in Avian Behavior,” Phys. Rev. E (submitted);
E. Canessa and A. Calmetta, Phys. Rev. E 50, R47 (1994).
C. K. Peng, S. Havlin, H. E. Stanley, and A. L. Goldberger, “Quantification of Scaling Exponents and Crossover Phenomena in Nonstationary Heartbeat Time Series,” [Proc. NATO Dynamical Disease Conference], edited by L. Glass [Chaos 5, 82 (1995)];
C. K. Peng, J. M. Hausdorff, J. E. Mietus, S. Havlin, H. E. Stanley, and A. L. Goldberger, “Fractals in Physiological Control: From Heartbeat to Gait,” in Proc. 1993 International Conf. on Lévy Flights, edited by U. Frisch, M. F. Shlesinger, and G. Zaslavsky (Springer, Berlin, 1995).
J. M. Hausdorff, C.-K. Peng, Z. Ladin, J. Y. Wei, and A. L. Goldberger, J. Appl. Physiol. 78, 349 (1995).
W. B. Cannon, Physiol. Rev. 9, 399 (1929).
Copyright information
© 1996 Kluwer Academic Publishers
About this chapter
Cite this chapter
Stanley, H.E. et al. (1996). Statistical and Linguistic Features of DNA Sequences. In: Riste, T., Sherrington, D. (eds) Physics of Biomaterials: Fluctuations, Selfassembly and Evolution. NATO ASI Series, vol 322. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-1722-4_9
DOI: https://doi.org/10.1007/978-94-009-1722-4_9
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-010-7271-7
Online ISBN: 978-94-009-1722-4
eBook Packages: Springer Book Archive