Universal Coding and Order Identification by Model Selection Methods

  • Élisabeth Gassiat

Part of the Springer Monographs in Mathematics book series (SMM)

Table of contents

  1. Front Matter
    Pages i-xv
  2. Élisabeth Gassiat
    Pages 1-27
  3. Élisabeth Gassiat
    Pages 29-74
  4. Élisabeth Gassiat
    Pages 75-101
  5. Élisabeth Gassiat
    Pages 103-144
  6. Back Matter
    Pages 145-146

About this book


The purpose of these notes is to highlight the far-reaching connections between Information Theory and Statistics. Universal coding and adaptive compression are indeed closely related to statistical inference concerning processes and using maximum likelihood or Bayesian methods.
The book is divided into four chapters, the first of which introduces readers to lossless coding, provides an intrinsic lower bound on the codeword length in terms of Shannon’s entropy, and presents some coding methods that can achieve this lower bound, provided the source distribution is known.
In turn, Chapter 2 addresses universal coding on finite alphabets, and seeks to find coding procedures that can achieve the optimal compression rate, regardless of the source distribution. It also quantifies the speed of convergence of the compression rate to the source entropy rate. These powerful results do not extend to infinite alphabets.
In Chapter 3, it is shown that there are no universal codes over the class of stationary ergodic sources over a countable alphabet. This negative result prompts at least two different approaches: the introduction of smaller sub-classes of sources known as envelope classes, over which adaptive coding may be feasible, and the redefinition of the performance criterion by focusing on compressing the message pattern.
Finally, Chapter 4 deals with the question of order identification in statistics. This question belongs to the class of model selection problems and arises in various practical situations in which the goal is to identify an integer characterizing the model: the length of dependency for a Markov chain, number of hidden states for a hidden Markov chain, and number of populations for a population mixture. The coding ideas and techniques developed in previous chapters allow us to obtain new results in this area.
This book is accessible to anyone with a graduate-level background in mathematics, and will appeal to information theoreticians and mathematical statisticians alike. Except for Chapter 4, all proofs are detailed and all tools needed to understand the text are reviewed.
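
As a small illustration of the entropy lower bound mentioned for Chapter 1, the sketch below compares the expected codeword length of a Shannon code with the entropy of a known source. It is not taken from the book: the function names and the two example distributions are assumptions chosen for illustration, and the Shannon code (codeword lengths equal to the ceiling of -log2 p) is just one standard construction. For a known distribution with all probabilities positive, such a code satisfies the Kraft inequality and its expected length lies between H and H + 1 bits.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits of a finite probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def shannon_code_lengths(probs):
    """Codeword lengths ceil(-log2 p) of a Shannon code (requires every p > 0)."""
    return [math.ceil(-math.log2(p)) for p in probs]


def summarize(probs):
    """Return (entropy, expected length, Kraft sum) for the Shannon code."""
    lengths = shannon_code_lengths(probs)
    expected_length = sum(p * l for p, l in zip(probs, lengths))
    kraft_sum = sum(2.0 ** -l for l in lengths)
    return entropy_bits(probs), expected_length, kraft_sum


# Hypothetical example sources (not from the book).
dyadic = [0.5, 0.25, 0.125, 0.125]   # probabilities are powers of 1/2
generic = [0.4, 0.3, 0.2, 0.1]       # probabilities are not powers of 1/2

for name, probs in [("dyadic", dyadic), ("generic", generic)]:
    h, expected_length, kraft_sum = summarize(probs)
    # A prefix code with these lengths exists because the Kraft sum is <= 1.
    print(f"{name}: H = {h:.4f} bits, E[length] = {expected_length:.4f}, "
          f"Kraft sum = {kraft_sum:.4f}")
```

For the dyadic source the expected length equals the entropy, so the lower bound is attained exactly. For the generic source the expected length exceeds the entropy by less than one bit.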


68P30, 62C10; Universal Coding; Adaptive Compression; Hidden Markov Chains; Model Selection; Infinite Alphabets

Authors and affiliations

  • Élisabeth Gassiat
    • 1
  1. Laboratoire de Mathématiques, Université Paris-Sud, Orsay Cedex, France

Bibliographic information

  • DOI
  • Copyright Information: Springer International Publishing AG, part of Springer Nature 2018
  • Publisher Name: Springer, Cham
  • eBook Packages: Mathematics and Statistics
  • Print ISBN: 978-3-319-96261-0
  • Online ISBN: 978-3-319-96262-7
  • Series Print ISSN: 1439-7382
  • Series Online ISSN: 2196-9922