Part of the Information Science and Statistics book series (ISS)
For a string x^n, generated by sampling a probability distribution P(x^n), we have already suggested the ideal code length −log P(x^n) to serve as its complexity, the Shannon complexity, with the justification that its mean is, for large alphabets, a tight lower bound for the mean prefix code length. The problem, of course, is that this measure of complexity depends very strongly on the distribution P, which in the cases of interest to us is not given. Nevertheless, we feel intuitively that a measure of the complexity of a string ought to be linked with the ease of its description. For instance, consider the following three types of data strings of length n = 20, where the length actually ought to be taken large to make our point:
generate a string by flipping a coin 20 times
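The dependence of the ideal code length on the assumed distribution P can be made concrete with a small sketch. It is only an illustration under stated assumptions: the model is taken to be i.i.d. Bernoulli, logarithms are taken to base 2 so that lengths are in bits, and the function name `ideal_code_length` and the example string are hypothetical, not taken from the text.

```python
import math


def ideal_code_length(bits: str, p_one: float) -> float:
    """Return -log2 P(bits) under an assumed i.i.d. Bernoulli(p_one) model."""
    if not 0.0 < p_one < 1.0:
        raise ValueError("p_one must lie strictly between 0 and 1")
    if any(symbol not in "01" for symbol in bits):
        raise ValueError("bits must contain only the characters '0' and '1'")
    ones = bits.count("1")
    zeros = len(bits) - ones
    # log P(x^n) = (#ones) * log p + (#zeros) * log (1 - p) for an i.i.d. model
    return -(ones * math.log2(p_one) + zeros * math.log2(1.0 - p_one))


# Hypothetical binary string of length n = 20 (illustration only).
example = "01" * 10

# The same string receives different code lengths under different assumed P.
print(ideal_code_length(example, 0.5))  # fair-coin model
print(ideal_code_length(example, 0.9))  # model strongly biased towards '1'
```

Under the fair-coin model every string of length 20 gets the same code length of 20 bits, whatever its regularity, while the biased model assigns this particular string a longer code. This is the sense in which the measure is only as informative as the distribution supplied to it.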
Keywords: Binary String, Recursive Function, Code Length, Kolmogorov Complexity, Regular Feature
© Springer Science+Business Media, LLC 2007