Maximum Likelihood Clustering with Outliers

  • María Teresa Gallegos
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


Suppose that we are given a list of n ℝ^p-valued observations and a natural number r ≤ n. Further, assume that r of them arise from any one of g normally distributed populations, whereas the other n − r observations are contaminations. We develop estimators that simultaneously detect n − r outliers and partition the remaining r observations into g clusters. We analyze under which conditions these estimators are maximum likelihood estimators. Finally, we propose algorithms that approximate these estimators.
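
To make the setting concrete, the following is a minimal sketch of a trimmed clustering heuristic. It is not the algorithm proposed in the paper. It assumes the simplest special case, in which all g populations share a spherical covariance matrix, so that the likelihood criterion is replaced here by a trimmed within-cluster sum of squares (a trimmed k-means-type criterion). The function names `sq_dist`, `one_run` and `trimmed_clustering`, the restart strategy and the toy data are illustrative assumptions.

```python
import random


def sq_dist(x, c):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def one_run(points, centres, r, max_iter):
    """One alternating run (assign, trim, update) from given start centres."""
    n, g = len(points), len(centres)
    centres = list(centres)
    labels = None
    for _ in range(max_iter):
        # Nearest centre and its squared distance for every observation.
        nearest = []
        for x in points:
            d = [sq_dist(x, c) for c in centres]
            j = min(range(g), key=d.__getitem__)
            nearest.append((d[j], j))
        # Retain the r observations closest to their centre; the other
        # n - r are flagged as outliers with label -1.
        order = sorted(range(n), key=lambda i: nearest[i][0])
        keep = set(order[:r])
        new_labels = [nearest[i][1] if i in keep else -1 for i in range(n)]
        if new_labels == labels:
            break  # assignment is stable, so the run has converged
        labels = new_labels
        # Recompute each centre as the mean of its retained members.
        for j in range(g):
            members = [points[i] for i in range(n) if labels[i] == j]
            if members:
                centres[j] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    # Trimmed within-cluster sum of squares over the retained observations.
    cost = sum(sq_dist(points[i], centres[labels[i]])
               for i in range(n) if labels[i] != -1)
    return cost, labels, centres


def trimmed_clustering(points, g, r, n_starts=50, max_iter=100, seed=0):
    """Best of several random restarts (assumed heuristic, spherical case)."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        start = rng.sample(points, g)  # g observations as initial centres
        result = one_run(points, start, r, max_iter)
        if best is None or result[0] < best[0]:
            best = result
    return best


# Toy data (assumed): two tight groups plus two gross outliers.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
        (100.0, 100.0), (-100.0, 50.0)]
cost, labels, centres = trimmed_clustering(data, g=2, r=6)
print(labels)  # retained points carry a cluster index, outliers carry -1
```

Several restarts are used because a single run can stall when an initial centre happens to sit on an outlier. Handling general covariance matrices would require replacing the squared Euclidean distance and the sum-of-squares cost, which this sketch does not attempt.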


Keywords (added by machine, not by the authors): Maximum Likelihood Estimator; Determinant Criterion; Robust Version; Pure Cluster; Minimum Covariance Determinant





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • María Teresa Gallegos (1)
  1. Fakultät für Mathematik und Informatik, Universität Passau, Passau, Germany
