Abstract
In many applications, modelling techniques that account for the inherent variability of the data are necessary. In this paper, we present an approach to modelling class-specific pattern variation based on tangent distance within a statistical framework for classification. The model is an effective means of explicitly incorporating invariance with respect to transformations that do not change class membership, such as small affine transformations in the case of image objects. If no prior knowledge about the type of variability is available, it is desirable to learn the model parameters from the data. The probabilistic interpretation presented here allows us to view learning of the variational derivatives as a maximum likelihood estimation problem. We present experimental results from two real-world pattern recognition tasks: image object recognition and automatic speech recognition. On the US Postal Service handwritten digit recognition task, learning of variability achieves results comparable to those obtained using specific domain knowledge. On the SieTill corpus of continuously spoken, telephone-line-recorded German digit strings, the method shows a significant improvement over a common mixture density approach with a comparable number of parameters. The probabilistic model is well suited to statistical pattern recognition and can be extended to other domains such as cluster analysis.
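To make the central notion concrete, the following is a minimal sketch of (one-sided) tangent distance in the sense of Simard et al.: the distance between two patterns is minimized over small transformations of one pattern, represented linearly by tangent vectors. This is an illustrative toy example, not the authors' implementation; the shift-tangent approximation via a discrete gradient is an assumption made here for demonstration.

```python
import numpy as np

def tangent_distance(x, y, T):
    """One-sided tangent distance between patterns x and y.

    T: (d, k) matrix whose columns are tangent vectors of x, i.e.
    approximate derivatives of the pattern with respect to small
    transformations (shift, rotation, scaling, ...).
    Minimizes ||x + T @ a - y||^2 over the coefficient vector a,
    which is a linear least-squares problem.
    """
    a, *_ = np.linalg.lstsq(T, y - x, rcond=None)
    return float(np.sum((x + T @ a - y) ** 2))

# Toy example: a 1-D "image" and a half-shifted version of it.
x = np.array([0.0, 1.0, 0.0, 0.0])
y = np.array([0.0, 0.5, 0.5, 0.0])
t = np.gradient(x)           # crude horizontal-shift tangent of x
T = t[:, None]               # one tangent vector as a column

d_euclid = float(np.sum((x - y) ** 2))
d_tangent = tangent_distance(x, y, T)
# Minimizing over the tangent subspace can only reduce the distance:
assert d_tangent <= d_euclid
```

Because the coefficient a = 0 recovers the plain Euclidean distance, the tangent distance is never larger than it; the paper's contribution is to estimate the tangent vectors themselves from data via maximum likelihood rather than deriving them from prior knowledge.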
Keywords
- Linear Discriminant Analysis
- Tangent Vector
- Mahalanobis Distance
- Automatic Speech Recognition
- Optical Character Recognition
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Keysers, D., Macherey, W., Dahmen, J., Ney, H. (2001). Learning of Variability for Invariant Statistical Pattern Recognition. In: De Raedt, L., Flach, P. (eds) Machine Learning: ECML 2001. ECML 2001. Lecture Notes in Computer Science(), vol 2167. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44795-4_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42536-6
Online ISBN: 978-3-540-44795-5