
Randomness and Complexity

Chapter in Bioinformatics, part of the book series Computational Biology (volume 21).

Abstract

This chapter begins with the tricky concept of randomness. Since a random sequence (e.g., of DNA) represents the null hypothesis for many propositions concerning purported regularities, it is of fundamental importance to master randomness, insofar as it can in principle be mastered. The chapter then moves on to random processes and the powerful approach of the Markov chain, which leads naturally into a consideration of random walks. Noise, already introduced as a disturbance in Chap. 3, is given further consideration. The second part of the chapter deals with complexity. The quantification of complexity may be useful for a number of topics within bioinformatics; for example, phenotypic complexity is commonly supposed to have gradually increased during the history of life on Earth, although there appears to be no comprehensive quantitative evidence for this.


Notes

  1.

    An obvious corollary of this association of randomness with algorithmic compressibility is that there is an intrinsic absurdity in the notion of an algorithm for generating random numbers, such as those included with many compilers and other software packages. These computer-generated pseudorandom numbers generally pass the usual statistical tests for randomness, but little is known about how their nonrandomness affects results obtained using them. Quite possibly the best heuristic sources of (pseudo)random digits are the successive digits of irrational numbers like \(\pi \) or \(\sqrt{2}\). These can be generated by a deterministic algorithm and, of course, are always the same, but in the sense that one cannot jump to (say) the hundredth digit without computing those preceding it, they do fulfil the criteria of haphazardness.
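As a minimal illustration of the last point, the fractional digits of \(\sqrt{2}\) can be generated deterministically to any desired precision; the sketch below uses Python's decimal module, and the precision chosen is an arbitrary illustrative value. One must indeed compute all the preceding digits to reach a given one.

```python
from decimal import Decimal, getcontext

# Generate the first 100 fractional digits of the irrational number sqrt(2).
# The stream is produced by a deterministic algorithm and is always the same,
# yet there is no shortcut to the hundredth digit without computing those
# preceding it.
getcontext().prec = 110          # a few guard digits beyond what we keep
root2 = Decimal(2).sqrt()        # 1.41421356...
digits = str(root2)[2:102]       # drop the leading "1." and keep 100 digits

print(digits[:10])               # first ten fractional digits: 4142135623
```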

  2.

    After Volchan (2002).

  3.

    von Mises called the random sequences in accord with this notion “collectives”. It was subsequently shown that the collectives were not random enough (see Volchan (2002) for more details); for example, the number \(0.0123456789101112131415161718192021\ldots \) satisfies von Mises’ criteria but is clearly computable.

  4.

    The Kolmogorov-Chaitin definition of the descriptive or algorithmic complexity K(s) of a symbolic sequence s with respect to a machine M running a program P is given by

    $$\begin{aligned} K(s) = \min _{P \,:\, M(P) = s} |P| \;. \end{aligned}$$
    (6.1)

    This means that K(s) is the size |P| of the smallest input program P that prints s and then stops when input into M. In other words, it is the length of the shortest (binary) program that describes (codifies) s. Insofar as M is usually taken to be a universal Turing machine, the definition is machine-independent.
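K(s) itself is uncomputable, but the production complexity of Lempel and Ziv (1976), cited in the references, provides a computable proxy: the number of distinct phrases in an exhaustive parsing of the sequence. A minimal sketch, in which the function name is our own and the parsing rule follows the usual exhaustive-history construction:

```python
def lz76_complexity(s: str) -> int:
    """Number of phrases in the Lempel-Ziv (1976) parsing of s.

    Each phrase is the shortest new substring starting at the current
    position that has not occurred earlier (overlaps permitted).
    """
    phrases, i, n = 0, 0, len(s)
    while i < n:
        k = 1
        # extend the candidate phrase while it already occurs in the history
        while i + k <= n and s[i:i + k] in s[:i + k - 1]:
            k += 1
        phrases += 1
        i += k
    return phrases

# A periodic sequence parses into very few phrases; an irregular one
# into many more.
print(lz76_complexity("01" * 20))   # 3
print(lz76_complexity("0001101"))   # 4
```

A random sequence of length n yields close to the maximum number of phrases, whereas a highly regular one yields very few, in keeping with the identification of randomness with incompressibility.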

  5.

    In some of the literature, one finds stochastic matrices arranged such that the columns rather than the rows sum to unity. The arrow in the top left-hand corner serves to indicate which convention is being used.

  6.

    As for the transition matrix for a zeroth-order chain (i.e., independent trials).
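A concrete (hypothetical) illustration of the row convention: each row of the transition matrix sums to unity, and a row of equal entries corresponds to independent trials. The matrix entries below are arbitrary illustrative values over the DNA alphabet.

```python
import numpy as np

states = "ACGT"
# Hypothetical first-order transition matrix, row convention:
# P[i, j] is the probability of state j following state i.
P = np.array([
    [0.40, 0.20, 0.20, 0.20],
    [0.10, 0.50, 0.30, 0.10],
    [0.25, 0.25, 0.25, 0.25],   # this row is as in independent trials
    [0.20, 0.20, 0.20, 0.40],
])
assert np.allclose(P.sum(axis=1), 1.0)   # rows, not columns, sum to unity

# Simulate a short sequence from the chain, starting in state A.
rng = np.random.default_rng(0)
seq, i = [], 0
for _ in range(20):
    i = rng.choice(4, p=P[i])
    seq.append(states[i])
print("".join(seq))
```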

  7.

    See Billingsley (1961), especially for the proof of Whittle’s formula, Eq. (6.9).

  8.

    Fick’s first law is

    $$\begin{aligned} J_i = -D_i \nabla c_i \;, \end{aligned}$$
    (6.15)

    where \(J_i\) is the flux of substance i across a plane and \(c_i\) is its (position-dependent) concentration. In one dimension, this law simply reduces to \(J = -D \partial c(x)/ \partial x\), where x is the spatial coordinate. In most cases, especially in the crowded milieu of a living cell, it is more appropriate to use the (electro)chemical potential \(\mu \) than the concentration, whereupon the law becomes

    $$\begin{aligned} J_i = -\frac{D_i c_i}{k_B T} \nabla \mu _i \;, \end{aligned}$$
    (6.16)

    where \(k_B\) is Boltzmann’s constant and T is the absolute temperature. Fick’s second law, appropriate for time-varying concentrations, is

    $$\begin{aligned} \partial c / \partial t = D \nabla ^2 c \;. \end{aligned}$$
    (6.17)

    If D itself changes with position (e.g., the diffusivity of a protein depends on the local concentration of small ions surrounding it), then we have

    $$\begin{aligned} \partial c / \partial t = \nabla \cdot (D \nabla c ) \;. \end{aligned}$$
    (6.18)
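Fick’s second law in one dimension can be integrated numerically. The sketch below uses the explicit (FTCS) finite-difference scheme with arbitrary illustrative values of D, dx, and dt satisfying the stability condition \(D\,\Delta t/\Delta x^2 \le 1/2\); the function name and grid size are our own choices.

```python
import numpy as np

def diffuse(c, D, dx, dt, steps):
    """Integrate dc/dt = D d2c/dx2 with zero-flux (reflecting) boundaries."""
    c = c.astype(float).copy()
    r = D * dt / dx**2
    assert r <= 0.5, "FTCS stability condition violated"
    for _ in range(steps):
        lap = np.empty_like(c)
        lap[1:-1] = c[2:] - 2 * c[1:-1] + c[:-2]
        lap[0] = c[1] - c[0]          # reflecting boundary stencils
        lap[-1] = c[-2] - c[-1]
        c += r * lap
    return c

# A point source in the middle of the grid spreads out diffusively:
# the total amount is conserved and the variance of the profile grows
# as 2*D*t, the hallmark of a random walk.
x = np.arange(201)
c0 = np.zeros(201)
c0[100] = 1.0
c = diffuse(c0, D=1.0, dx=1.0, dt=0.25, steps=400)   # t = 100
```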

  9.

    If this is so, it seems rather strange that presumably complex people expend so much ingenuity on making their environments more uniform and unchanging, since they will thereby tend to lose their competitive advantage.

  10.

    Many considerations of complexity may be reduced to the problem of printing out a number. Thus, the complexity of a protein structure is related to the number specifying the positions of the atoms, or dihedral angles of the peptide groups, which is equivalent to selecting one from a list of all possible conformations; the difficulty of doing that is roughly the same as that of printing out the largest number in that list.

  11.

    Cf. the nursery rhyme Humpty Dumpty sat on a wall/Humpty Dumpty had a great fall/And all the king’s horses and all the king’s men/Couldn’t put Humpty together again. It follows that Humpty Dumpty had great depth, hence complexity.

  12.

    If there is no environment, then all strings have the maximum complexity, \(K_\mathrm{max}\).

  13.

    Due to Bennett (1988).

References

  • Adami C, Cerf NJ (2000) Physical complexity of symbolic sequences. Phys D 137:62–69

  • Bennett CH (1988) Logical depth and physical complexity. In: Herken R (ed) The Universal Turing Machine—A Half-Century Survey. Oxford University Press, Oxford, pp 227–257

  • Billingsley P (1961) Statistical methods in Markov chains. Ann Math Statist 32:12–40

  • Grassberger P (1986) Toward a quantitative theory of self-generated complexity. Int J Theor Phys 25:907–938

  • Lempel A, Ziv J (1976) On the complexity of finite sequences. IEEE Trans Inf Theory IT-22:75–81

  • Lloyd S, Pagels H (1988) Complexity as thermodynamic depth. Ann Phys 188:186–213

  • van der Waerden BL (1927) Beweis einer Baudet’schen Vermutung. Nieuw Arch Wiskunde 15:212–216

  • Volchan SB (2002) What is a random sequence? Am Math Monthly 109:46–63

Author information

Correspondence to Jeremy Ramsden.

Copyright information

© 2015 Springer-Verlag London

About this chapter

Cite this chapter

Ramsden, J. (2015). Randomness and Complexity. In: Bioinformatics. Computational Biology, vol 21. Springer, London. https://doi.org/10.1007/978-1-4471-6702-0_6


  • Print ISBN: 978-1-4471-6701-3

  • Online ISBN: 978-1-4471-6702-0
