Hidden Markov Models

  • Saeed V. Vaseghi


Hidden Markov models (HMMs) are used for the statistical modelling of nonstationary stochastic processes such as speech and time-varying noise. An HMM models the time variations of the statistics of a random process with a Markovian chain of state-dependent stationary sub-processes. An HMM is essentially a Bayesian finite-state process, with a Markovian prior for modelling the transitions between the states and a set of state pdfs for modelling the random variations of the stochastic process within each state.
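The structure described above can be illustrated with a minimal sketch that draws a sequence from a small HMM. It assumes Gaussian state pdfs and a two-state model; the function name `sample_hmm` and all parameter values are hypothetical and chosen only for illustration, not taken from the chapter.

```python
import random

def sample_hmm(initial, transition, means, stds, length, seed=0):
    """Draw (states, observations) from an HMM with Gaussian state pdfs."""
    rng = random.Random(seed)
    n_states = len(initial)
    states, observations = [], []
    # The first state is drawn from the initial state probabilities.
    state = rng.choices(range(n_states), weights=initial)[0]
    for _ in range(length):
        states.append(state)
        # Within a state the process is stationary: one fixed Gaussian pdf.
        observations.append(rng.gauss(means[state], stds[state]))
        # Markovian prior: the next state depends only on the current state.
        state = rng.choices(range(n_states), weights=transition[state])[0]
    return states, observations

# Hypothetical two-state model: a low-variance state and a high-variance state.
initial = [0.5, 0.5]
transition = [[0.95, 0.05],
              [0.10, 0.90]]
means, stds = [0.0, 0.0], [0.1, 2.0]
states, observations = sample_hmm(initial, transition, means, stds, length=200)
```

The observation sequence is nonstationary overall (its variance changes whenever the hidden state changes), even though each state on its own is a stationary process.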

This chapter begins with a brief introduction to continuous and finite-state nonstationary models, before concentrating on the theory and applications of hidden Markov models. We study the Baum-Welch method for the maximum likelihood training of the parameters of an HMM, and then consider the use of HMMs and the Viterbi decoding algorithm for the classification and decoding of an unlabelled observation sequence. Finally, the application of HMMs in signal enhancement is considered.
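As a rough illustration of Viterbi decoding, the sketch below finds the most likely state sequence for an observation sequence. It assumes discrete observation symbols and a fully specified model, which is a simplification of the continuous state pdfs mentioned above; the function name `viterbi` and the example parameters are hypothetical.

```python
import math

def viterbi(observations, initial, transition, emission):
    """Return the most likely state path and its log-probability."""
    n_states = len(initial)

    def log(p):
        # Work in the log domain to avoid underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    # delta[i]: best log-probability of any path that ends in state i.
    delta = [log(initial[i]) + log(emission[i][observations[0]])
             for i in range(n_states)]
    backpointers = []
    for obs in observations[1:]:
        new_delta, pointers = [], []
        for j in range(n_states):
            # Best predecessor state for arriving in state j.
            best_i = max(range(n_states),
                         key=lambda i: delta[i] + log(transition[i][j]))
            pointers.append(best_i)
            new_delta.append(delta[best_i] + log(transition[best_i][j])
                             + log(emission[j][obs]))
        delta = new_delta
        backpointers.append(pointers)

    # Trace the best path backwards from the most likely final state.
    last = max(range(n_states), key=lambda i: delta[i])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]

# Hypothetical two-state model with three observation symbols (0, 1, 2).
initial = [0.6, 0.4]
transition = [[0.7, 0.3],
              [0.4, 0.6]]
emission = [[0.5, 0.4, 0.1],
            [0.1, 0.3, 0.6]]
path, log_prob = viterbi([0, 1, 2], initial, transition, emission)
```

For this toy model the decoded path is `[0, 0, 1]`, with a path probability of about 0.01512. For classification, one would typically score the same observation sequence against several trained models and pick the model with the highest score.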





Copyright information

© John Wiley & Sons Ltd. and B.G. Teubner 1996

Authors and Affiliations

  • Saeed V. Vaseghi
    • Queen’s University, Belfast, UK
