Most of the techniques presented in this book aim at making decisions about data. By data we mean, in general, vectors that represent real-world objects which computers cannot handle directly. The components of these vectors, the so-called features, are supposed to carry enough information to distinguish between different objects and to allow a correct decision (see Chapter 5). After a training procedure, the algorithms are typically capable of associating input vectors with output decisions. In some cases, however, the real-world objects of interest cannot be represented by a single vector because they are sequential in nature. This is the case for speech and handwriting, which can be thought of as sequences of phonemes (see Chapter 2) and letters, respectively, as well as for time series, biological sequences (e.g., DNA and protein chains), natural language sentences, music, and so on. The goal of this chapter is to show how some of the techniques presented so far for single vectors can be extended to sequential data.
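As a flavor of what "extending decisions to sequential data" involves, the sketch below decodes a hidden-state sequence with the Viterbi algorithm, one of the dynamic-programming techniques used with hidden Markov models. The states, observations, and probabilities are invented for illustration; they are not taken from this chapter.

```python
# Minimal Viterbi decoding for a toy hidden Markov model.
# All model parameters below are made up for the example.

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely state sequence."""
    # V[t][s]: probability of the best path ending in state s at time t
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states
            )
            V[t][s] = prob
            back[t][s] = prev
    # Backtrack from the most probable final state.
    prob, last = max((V[-1][s], s) for s in states)
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return prob, path[::-1]

states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

prob, path = viterbi(["walk", "shop", "clean"],
                     states, start_p, trans_p, emit_p)
print(path, prob)  # most likely hidden-state sequence and its probability
```

The key point is that the decision is made over the whole sequence at once: the dynamic program keeps, for each state and time step, only the best path so far, so the cost is linear in the sequence length rather than exponential.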
© 2008 Springer
(2008). Markovian Models for Sequential Data. In: Machine Learning for Audio, Image and Video Analysis. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-84800-007-0_10
Print ISBN: 978-1-84800-006-3
Online ISBN: 978-1-84800-007-0