Probabilistic Aspects in Classification

  • Hans H. Bock
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


This paper surveys various ways in which probabilistic approaches can be useful in partitional (‘non-hierarchical’) cluster analysis. Four basic distribution models for ‘clustering structures’ are described in order to derive suitable clustering strategies. They are exemplified for various special distribution cases, including dissimilarity data and random similarity relations. A special section describes statistical tests for checking the relevance of a calculated classification (e.g., the max-F test, convex cluster tests) and comparing it to standard clustering situations (comparative assessment of classifications, CAC).


Keywords: Mixture model, Cluster model, Cluster structure, Cluster validation, Cluster criterion





Copyright information

© Springer Japan 1998

Authors and Affiliations

  • Hans H. Bock, Institute of Statistics, Technical University of Aachen, Aachen, Germany
