Abstract
What do we mean when we say that a given system shows “complex behavior”, and can we provide precise measures for its degree of complexity? This chapter offers an account of several common measures of complexity and of the relation of complexity to predictability and emergence.
Notes
- 1.
In some areas, like the neurosciences or artificial intelligence, the term “Bayesian” is used for approaches using statistical methods, in particular in the context of hypothesis building, when estimates of probability distribution functions are derived from observations.
- 2.
The expression \(p(x_i)\) is therefore context specific and can denote both a properly normalized discrete distribution function as well as the value of a continuous probability distribution function.
- 3.
In formal texts on statistics and information theory the notation \(\mu=E(X)\) is often used, where μ is the mean and \(E(X)\) the expectation value of a random variable X; the capital X represents the abstract random variable, whereas x denotes a particular value and \(p_X(x)\) the probability distribution.
- 4.
Note the difference between a cumulative stochastic process, obtained by adding up the results of individual trials, and the “cumulative PDF” \(F(x)\) defined by \(F(x)=\int_{-\infty}^x p(x')dx'\).
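The distinction can be illustrated with a short numerical sketch (Python with NumPy; the Gaussian trial distribution is an arbitrary choice for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Cumulative stochastic process: the running sum of individual trials,
# a random trajectory that differs from realization to realization.
trials = rng.normal(loc=0.0, scale=1.0, size=1000)
cumulative_process = np.cumsum(trials)

# Cumulative PDF F(x) = int_{-inf}^{x} p(x') dx' of the same Gaussian:
# a deterministic, monotonically increasing function of x.
x = np.linspace(-5.0, 5.0, 2001)
p = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # standard normal density
F = np.cumsum(p) * (x[1] - x[0])             # numerical integration

print(F[-1])   # approaches 1: F is a normalized distribution function
```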
- 5.
For continuous-time data, as for an electrocardiogram, an additional symbolization step is necessary: the discretization of time. Here, however, we consider only discrete-time series.
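A minimal symbolization sketch (an illustration, not the book's procedure): sample the continuous signal at fixed time steps and encode each increment by its sign, producing a discrete-time series of symbols.

```python
import numpy as np

# Discretization of time: sample a continuous signal at fixed intervals,
# then encode each increment as '+' (rise) or '-' (fall), yielding a
# discrete-time, discrete-symbol series.
t = np.linspace(0.0, 10.0, 501)          # fixed-step time grid
signal = np.sin(2 * np.pi * 0.5 * t)     # stand-in for e.g. an ECG trace

increments = np.diff(signal)
symbols = np.where(increments >= 0, '+', '-')

print(''.join(symbols[:10]))
```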
- 6.
Remember that \(\textrm{XOR}(0,0)=0=\textrm{XOR}(1,1)\) and \(\textrm{XOR}(0,1)=1=\textrm{XOR}(1,0)\).
- 7.
A function \(f(x)\) is a function of a variable x; a functional \(F[f]\) is, on the other hand, functionally dependent on a function \(f(x)\). In formal texts on information theory the notation \(H(X)\) is often used for the Shannon entropy and a random variable X with probability distribution \(p_X(x)\).
- 8.
For a proof consider the generic substitution \(x\to q(x)\) and a transformation of variables \(x\to q\) via \(dx=dq/q'\), with \(q'=dq(x)/dx\), for the integration in Eq. (3.43).
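The hint presumably concerns the behavior of the differential (continuous) entropy under a change of variables; under that assumption, a sketch of the computation reads:

```latex
% Under y = q(x) the density transforms as p_y(y)\,|dy| = p(x)\,|dx|,
% i.e. p_y(y) = p(x)/|q'(x)|. Substituting into the differential entropy,
\begin{align*}
H[p_y] \;=\; -\int p_y(y)\,\ln p_y(y)\,\mathrm{d}y
  \;&=\; -\int p(x)\,\bigl[\ln p(x)-\ln|q'(x)|\bigr]\,\mathrm{d}x \\
  \;&=\; H[p] \;+\; \int p(x)\,\ln|q'(x)|\,\mathrm{d}x,
\end{align*}
% so the differential entropy is in general not invariant under x -> q(x).
```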
Further Reading
For further reading we recommend introductions to information theory (Cover and Thomas, 2006), to Bayesian statistics (Bolstad, 2004), to complex system theory in general (Boccara, 2003), and to algorithmic complexity (Li and Vitanyi, 1997).
For further studies we recommend several review articles: on the evolutionary development of complexity in organisms (Adami, 2002), on complexity and predictability (Boffetta et al., 2002), a critical assessment of various complexity measures (Olbrich et al., 2008), and a thoughtful discussion of various approaches to the notion of complexity (Manson, 2001).
For some further, somewhat more specialized topics, we recommend Binder (2008) for a perspective on the interplay between dynamical frustration and complexity, Binder (2009) for the question of decidability in complex systems, and Tononi and Edelman (1998) on possible interrelations between consciousness and complexity.
Adami, C. 2002 What is complexity? BioEssays 24, 1085–1094.
Binder, P.-M. 2008 Frustration in complexity. Science 320, 322–323.
Binder, P.-M. 2009 The edge of reductionism. Nature 459, 332–334.
Boccara, N. 2003 Modeling Complex Systems. Springer, Berlin.
Boffetta, G., Cencini, M., Falcioni, M., Vulpiani, A. 2002 Predictability: A way to characterize complexity. Physics Reports 356, 367–474.
Bolstad, W.M. 2004 Introduction to Bayesian Statistics. Wiley-IEEE, Hoboken, NJ.
Cover, T.M., Thomas, J.A. 2006 Elements of Information Theory. Wiley-Interscience, Hoboken, NJ.
Li, M., Vitanyi, P.M.B. 1997 An Introduction to Kolmogorov Complexity and Its Applications. Springer, Berlin.
Manson, S.M. 2001 Simplifying complexity: A review of complexity theory. Geoforum 32, 405–414.
Olbrich, E., Bertschinger, N., Ay, N., Jost, J. 2008 How should complexity scale with system size? The European Physical Journal B 63, 407–415.
Tononi, G., Edelman, G.M. 1998 Consciousness and complexity. Science 282, 1846.
Exercises
3.1.1 The Law of Large Numbers
Generalize the derivation for the law of large numbers given in Sect. 3.1.1 for the case of \(i=1,\dots,N\) independent discrete stochastic processes \(p^{(i)}_k\), described by their respective generating functionals \(G_i(x)=\sum_k p^{(i)}_k x^k\).
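The generalized statement can be checked numerically; the sketch below (with hypothetical distributions \(p^{(i)}_k\) drawn at random) verifies that the sample average of the summed process concentrates around the sum of the individual means.

```python
import numpy as np

rng = np.random.default_rng(1)

# N independent discrete processes p^(i)_k, each with its own random
# distribution over k = 0..3; by the law of large numbers the average of
# X = sum_i X_i over many trials concentrates around mu = sum_i E[X_i].
N, trials = 5, 200_000
distributions = [rng.dirichlet(np.ones(4)) for _ in range(N)]
mu = sum(float(np.dot(p, np.arange(4))) for p in distributions)

samples = sum(rng.choice(4, size=trials, p=p) for p in distributions)
sample_mean = samples.mean()

print(mu, sample_mean)   # the two agree closely for large trial numbers
```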
3.1.2 Symbolization of Financial Data
Generalize the symbolization procedure defined for the joint probabilities \(p_{\pm\pm}\) of Eq. (3.15) to joint probabilities \(p_{\pm\pm\pm}\); e.g. \(p_{+++}\) would measure the probability of three consecutive increases. Download from the Internet the historical data for your favorite financial asset, such as the Dow Jones or the Nasdaq stock index, and analyze it with this symbolization procedure. Discuss whether it would be possible, as a matter of principle, to develop a money-making scheme in this way.
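A sketch of the triple-symbol estimation (Python; a synthetic geometric random walk stands in here for downloaded index data):

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(2)

# Synthetic price series: a geometric random walk as a stand-in
# for a downloaded stock index.
prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=10_000)))

# Symbolize: '+' for an increase, '-' for a decrease, then estimate the
# joint probabilities p_{+++}, p_{++-}, ... from overlapping triples.
signs = np.where(np.diff(prices) >= 0, '+', '-')
triples = [''.join(signs[i:i + 3]) for i in range(len(signs) - 2)]
p3 = {pattern: n / len(triples) for pattern, n in Counter(triples).items()}

print(sum(p3.values()))       # joint probabilities are normalized
print(p3.get('+++', 0.0))     # probability of three consecutive rises
```

For an uncorrelated walk each of the eight patterns occurs with probability close to 1/8; systematic deviations in real data would be the starting point of the discussion asked for above.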
3.1.3 The OR Time Series with Noise
Consider the time series generated by a logical OR, akin to Eq. (3.16). Evaluate the probability \(p(1)\) of finding a 1, with and without averaging over initial conditions, both in the absence and in the presence of noise. Discuss the result.
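A possible numerical experiment (the precise update rule of Eq. (3.16) should be taken from the text; the sketch below assumes \(\sigma_{t+1}=\mathrm{OR}(\sigma_t,\sigma_{t-1})\), with each newly generated bit flipped with a small noise probability ξ):

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed update rule: sigma_{t+1} = OR(sigma_t, sigma_{t-1}),
# with the new bit flipped with probability xi (the noise).
def or_series(s0, s1, steps, xi, rng):
    s = [s0, s1]
    for _ in range(steps):
        bit = s[-1] | s[-2]
        if rng.random() < xi:      # noise: flip the generated bit
            bit ^= 1
        s.append(bit)
    return np.array(s)

# Without noise, starting from (0, 1): the series locks into 1.
p1_clean = or_series(0, 1, 1000, 0.0, rng)[2:].mean()

# With noise the series can escape the absorbing state, so p(1) < 1.
p1_noisy = or_series(0, 1, 100_000, 0.05, rng)[2:].mean()

print(p1_clean, p1_noisy)
```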
3.1.4 Maximal Entropy Distribution Function
Determine the probability distribution function \(p(x)\) with a given mean μ and a given variance \(\sigma^2\), compare Eq. (3.32), which maximizes the Shannon entropy.
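The answer is the classic maximum-entropy result; the sketch below checks numerically that an alternative distribution with the same mean and variance (here a Laplace distribution) has a smaller differential entropy than the Gaussian.

```python
import numpy as np

x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]

def entropy(p):
    """Differential Shannon entropy by numerical integration."""
    p = np.clip(p, 1e-300, None)
    return float(-np.sum(p * np.log(p)) * dx)

sigma = 1.0
gauss = np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

# Competitor with the same mean (0) and variance (sigma^2 = 2 b^2):
# a Laplace distribution with b = sigma / sqrt(2).
b = sigma / np.sqrt(2)
laplace = np.exp(-np.abs(x) / b) / (2 * b)

H_gauss, H_laplace = entropy(gauss), entropy(laplace)
print(H_gauss, H_laplace)   # H_gauss = 0.5*ln(2*pi*e) ~ 1.419 is larger
```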
3.1.5 Two-Channel Markov Process
Consider, in analogy to Eq. (3.34), the two-channel Markov process \(\{\sigma_t,\tau_t\}\),
Evaluate the joint and marginal distribution functions, the respective entropies and the resulting mutual information. Discuss the result as a function of noise strength α.
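Since Eq. (3.34) is not reproduced here, the following generic sketch only illustrates the quantities involved: it computes joint and marginal entropies and the mutual information for an assumed binary two-channel joint distribution in which the second channel copies the first, flipped with probability α.

```python
import numpy as np

# Assumed joint distribution for illustration (not Eq. (3.34) itself):
# sigma is uniform, tau = sigma flipped with probability alpha.
def mutual_information(alpha):
    joint = 0.5 * np.array([[1 - alpha, alpha],
                            [alpha, 1 - alpha]])
    p_sigma = joint.sum(axis=1)   # marginal of sigma
    p_tau = joint.sum(axis=0)     # marginal of tau

    def H(p):
        p = p[p > 0]
        return float(-np.sum(p * np.log2(p)))

    # I(sigma; tau) = H(sigma) + H(tau) - H(sigma, tau)
    return H(p_sigma) + H(p_tau) - H(joint.ravel())

print(mutual_information(0.0))   # perfectly correlated channels: 1 bit
print(mutual_information(0.5))   # completely decorrelated: 0 bits
```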
3.1.6 Kullback-Leibler Divergence
Try to approximate an exponential distribution function by a scale-invariant PDF, considering the Kullback-Leibler divergence \(K[p;q]\), Eq. (3.45), for the two normalized PDFs
Which exponent γ minimizes \(K[p;q]\)? How many times do the graphs for \(p(x)\) and \(q(x)\) cross?
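The minimization can be carried out numerically. The specific PDFs of the exercise are given in the text; as an illustration we assume \(p(x)=e^{-x}\) and the normalized scale-invariant \(q(x)=\gamma\,(1+x)^{-(1+\gamma)}\) on \(x\ge0\), and minimize \(K[p;q]=\int p\ln(p/q)\,dx\) over γ by a grid search.

```python
import numpy as np

# Assumed illustrative PDFs (both normalized on x >= 0):
#   p(x) = exp(-x),   q(x) = gamma * (1 + x)^{-(1+gamma)}
x = np.linspace(0.0, 50.0, 50_001)
dx = x[1] - x[0]
p = np.exp(-x)

def K(gamma):
    """Kullback-Leibler divergence K[p;q] by numerical integration."""
    q = gamma * (1.0 + x) ** (-(1.0 + gamma))
    return float(np.sum(p * np.log(p / q)) * dx)

gammas = np.linspace(0.5, 4.0, 351)
Ks = np.array([K(g) for g in gammas])
gamma_best = gammas[np.argmin(Ks)]
print(gamma_best)
```

For these assumed forms the minimum lies at \(\gamma=1/E[\ln(1+x)]\) with the expectation taken under \(p(x)\), which the grid search reproduces.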
3.1.7 Chi-Squared Test
The quantity
\(\chi^2[p;q]=\sum_i \frac{(p_i-q_i)^2}{q_i}\)
measures the similarity of two normalized probability distribution functions \(p_i\) and \(q_i\). Show that the Kullback-Leibler divergence \(K[p;q]\), Eq. (3.45), reduces to \(\chi^2[p;q]/2\) if the two distributions are quite similar.
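The leading-order relation can be verified numerically (a sketch in Python, with \(\chi^2[p;q]=\sum_i(p_i-q_i)^2/q_i\) and an arbitrary small perturbation of an arbitrary reference distribution):

```python
import numpy as np

# Two nearby normalized distributions: q plus a small perturbation
# eps that sums to zero, so p = q + eps remains normalized.
q = np.array([0.10, 0.20, 0.30, 0.15, 0.15, 0.10])
eps = 1e-3 * np.array([1.0, -2.0, 0.5, 1.5, -0.5, -0.5])
p = q + eps

K = float(np.sum(p * np.log(p / q)))        # Kullback-Leibler divergence
chi2 = float(np.sum((p - q) ** 2 / q))      # chi-squared statistic
print(K, chi2 / 2)                          # agree to leading order in eps
```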
3.1.8 Excess Entropy
Use the representation
to prove that \(E\ge0\), compare Eqs. (3.51) and (3.53), as long as \(H[p_n]\) is concave as a function of n.
3.1.9 Tsallis Entropy
The “Tsallis entropy”
\(H_q[p]=\frac{1}{q-1}\Big(1-\sum_i p_i^q\Big)\)
of a probability distribution function p is a popular non-extensive generalization of the Shannon entropy \(H[p]\). Prove that
\(\lim_{q\to1}H_q[p]=H[p]\)
and the non-extensiveness
\(H_q[XY]=H_q[X]+H_q[Y]+(1-q)\,H_q[X]\,H_q[Y]\)
for two statistically independent systems X and Y. For which distribution function p is \(H_q[p]\) maximal?
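Both claims can be checked numerically with the standard definition \(H_q[p]=(1-\sum_i p_i^q)/(q-1)\) (the distributions below are arbitrary examples):

```python
import numpy as np

def tsallis(p, q):
    """Tsallis entropy H_q[p] = (1 - sum_i p_i^q) / (q - 1)."""
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def shannon(p):
    return float(-np.sum(p * np.log(p)))

p = np.array([0.2, 0.3, 0.5])

# q -> 1 recovers the Shannon entropy
print(tsallis(p, 1.0001), shannon(p))

# Non-extensiveness for a product distribution of independent systems:
# H_q[XY] = H_q[X] + H_q[Y] + (1 - q) H_q[X] H_q[Y]
q = 2.0
px, py = np.array([0.4, 0.6]), np.array([0.1, 0.9])
pxy = np.outer(px, py).ravel()      # statistically independent systems
lhs = tsallis(pxy, q)
rhs = tsallis(px, q) + tsallis(py, q) + (1 - q) * tsallis(px, q) * tsallis(py, q)
print(lhs, rhs)
```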
© 2011 Springer-Verlag Berlin Heidelberg
Cite this chapter
Gros, C. (2011). Complexity and Information Theory. In: Complex and Adaptive Dynamical Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04706-0_3
Print ISBN: 978-3-642-04705-3
Online ISBN: 978-3-642-04706-0