Pattern Learning and Recognition on Statistical Manifolds: An Information-Geometric Review

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7953)

Abstract

We review the information-geometric framework for statistical pattern recognition: First, we explain the role of statistical similarity measures and distances in fundamental statistical pattern recognition problems. We then concisely review the main statistical distances and report a novel versatile family of divergences. Depending on their intrinsic complexity, the statistical patterns are learned by either atomic parametric distributions, semi-parametric finite mixtures, or non-parametric kernel density distributions. Those statistical patterns are interpreted and handled geometrically in statistical manifolds either as single points, weighted sparse point sets or non-weighted dense point sets. We explain the construction of the two prominent families of statistical manifolds: The Rao Riemannian manifolds with geodesic metric distances, and the Amari-Chentsov manifolds with dual asymmetric non-metric divergences. For the latter manifolds, when considering atomic distributions from the same exponential families (including the ubiquitous Gaussian and multinomial families), we end up with dually flat exponential family manifolds that play a crucial role in many applications. We compare the advantages and disadvantages of these two approaches from the algorithmic point of view. Finally, we conclude with further perspectives on how “geometric thinking” may spur novel pattern modeling and processing paradigms.
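
The contrast the abstract draws between geodesic metric distances and asymmetric divergences can be illustrated on the simplest case, univariate Gaussians. The sketch below is not taken from the paper: the function names `fisher_rao_normal` and `kl_normal` are hypothetical, and it assumes the standard closed forms for the Fisher-Rao distance (hyperbolic geometry of the mean/standard-deviation half-plane) and for the Kullback-Leibler divergence between two normal distributions.

```python
import math


def fisher_rao_normal(mu1, sigma1, mu2, sigma2):
    """Rao geodesic distance between N(mu1, sigma1^2) and N(mu2, sigma2^2).

    Assumes the Fisher metric ds^2 = (dmu^2 + 2 dsigma^2) / sigma^2, which is
    a scaled Poincare upper half-plane metric.
    """
    num = 0.5 * (mu1 - mu2) ** 2 + (sigma1 - sigma2) ** 2
    return math.sqrt(2.0) * math.acosh(1.0 + num / (2.0 * sigma1 * sigma2))


def kl_normal(mu1, sigma1, mu2, sigma2):
    """Kullback-Leibler divergence KL(N(mu1, sigma1^2) || N(mu2, sigma2^2))."""
    return (
        math.log(sigma2 / sigma1)
        + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2.0 * sigma2 ** 2)
        - 0.5
    )


if __name__ == "__main__":
    p = (0.0, 1.0)  # (mean, standard deviation), illustrative values
    q = (1.0, 2.0)
    # The Rao distance is a metric: swapping the arguments gives the same value.
    print("Rao(p, q) =", fisher_rao_normal(*p, *q))
    print("Rao(q, p) =", fisher_rao_normal(*q, *p))
    # The KL divergence is not symmetric: the two orientations differ.
    print("KL(p || q) =", kl_normal(*p, *q))
    print("KL(q || p) =", kl_normal(*q, *p))
```

Running the script prints two equal Rao distances and two different Kullback-Leibler values, which is the symmetric-versus-asymmetric distinction in miniature.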




© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nielsen, F. (2013). Pattern Learning and Recognition on Statistical Manifolds: An Information-Geometric Review. In: Hancock, E., Pelillo, M. (eds) Similarity-Based Pattern Recognition. SIMBAD 2013. Lecture Notes in Computer Science, vol 7953. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39140-8_1


  • DOI: https://doi.org/10.1007/978-3-642-39140-8_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39139-2

  • Online ISBN: 978-3-642-39140-8

  • eBook Packages: Computer Science (R0)
