# A Combinatorial Problem Arising in Information Theory: Precise Minimax Redundancy for Markov Sources

• Philippe Jacquet
• Wojciech Szpankowski
Conference paper
Part of the Trends in Mathematics book series (TM)

## Abstract

Redundancy of a code is defined as the excess of the code length over the optimal code length. When the source of information is unknown, one wants to design the best code for the worst source within the class of sources under consideration. This is called the minimax redundancy, and it comes in two flavors: average case and worst case. The latter is known as the maximal minimax redundancy, and it is studied in this paper for Markovian sources. Surprisingly, this problem led us to an interesting combinatorial problem on directed graphs that we solve using analytic tools. More precisely, we need to count the number of Eulerian cycles in a directed multigraph, and the maximal minimax redundancy turns out to be a sum over such Eulerian paths. In particular, we prove that the maximal minimax redundancy for Markov sources of order r is asymptotically equal to (1/2) m^r (m-1) log n + log A_m + O(1/n), where n is the length of the source sequence, m is the size of the alphabet, and A_m is an explicit constant that depends on m.
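The quantity studied here can be computed exactly for small n by brute force, which makes the asymptotic formula concrete. The sketch below is not from the paper; it is a minimal illustration under assumed conventions (binary alphabet m = 2, Markov order r = 1, and the first symbol contributing probability 1, which only shifts the constant term). It evaluates the Shtarkov sum — the total maximum-likelihood probability over all 2^n sequences — whose base-2 logarithm is the maximal minimax redundancy; for m = 2, r = 1 the leading term (1/2) m^r (m-1) log n reduces to log₂ n.

```python
import itertools
import math

def ml_prob(x):
    """Maximum-likelihood probability of binary string x under a
    first-order Markov model (first symbol taken as given)."""
    counts = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for a, b in zip(x, x[1:]):
        counts[(a, b)] += 1
    p = 1.0
    for a in (0, 1):
        row = counts[(a, 0)] + counts[(a, 1)]  # visits to state a (as a predecessor)
        for b in (0, 1):
            k = counts[(a, b)]
            if k:  # ML transition probability is the empirical frequency k/row
                p *= (k / row) ** k
    return p

def shtarkov_redundancy(n):
    """log2 of the Shtarkov sum over all binary strings of length n:
    the maximal minimax redundancy for this model class."""
    d = sum(ml_prob(x) for x in itertools.product((0, 1), repeat=n))
    return math.log2(d)
```

For example, successive values of `shtarkov_redundancy(n)` grow roughly like log₂ n plus a constant, in line with the theorem; doubling n increases the redundancy by about one bit. The exhaustive sum is only feasible for small n (it ranges over 2^n strings), which is precisely why the paper's analytic evaluation via Eulerian-path counting is needed.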

## Keywords

Frequency Count · Code Length · Finite Alphabet · Minimax Regret · Universal Code
