Abstract
Suppose we are given n observations in ℝ^p and a natural number r ≤ n. Assume that r of the observations each arise from one of g normally distributed populations, whereas the remaining n − r are contaminations. We develop estimators that simultaneously detect the n − r outliers and partition the remaining r observations into g clusters. We analyze the conditions under which these estimators are maximum likelihood estimators. Finally, we propose algorithms that approximate these estimators.
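To make the problem setting concrete, the following is a minimal sketch of a trimmed, k-means-style alternating heuristic: it assigns each observation to its nearest center, retains the r observations closest to their centers, treats the other n − r as outliers, and recomputes the centers. It is not the chapter's estimator or algorithm. The function name `trimmed_clustering`, the use of squared Euclidean distance (in effect a spherical, equal-spread simplification), the random initialization, and the sample data are all assumptions made for illustration.

```python
import random


def sq_dist(x, y):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def trimmed_clustering(data, r, g, n_iter=50, seed=0):
    """Illustrative heuristic (hypothetical helper, not the chapter's method).

    Returns (labels, centers). labels[i] is a cluster index in 0..g-1 for
    the r retained observations and -1 for the n - r discarded ones.
    """
    rng = random.Random(seed)
    centers = rng.sample(data, g)  # assumption: random initial centers
    labels = None
    for _ in range(n_iter):
        # Distance to, and index of, the nearest center for each observation.
        nearest = [
            min((sq_dist(x, c), j) for j, c in enumerate(centers))
            for x in data
        ]
        # Retain the r observations closest to their nearest center.
        order = sorted(range(len(data)), key=lambda i: nearest[i][0])
        new_labels = [-1] * len(data)
        for i in order[:r]:
            new_labels[i] = nearest[i][1]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Recompute each center from its retained members.
        for j in range(g):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:
                centers[j] = mean(members)
    return labels, centers


# Made-up sample: two tight groups plus two distant points.
data = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
    (5.0, 5.1), (5.2, 5.0), (5.1, 5.2),
    (30.0, -20.0), (-25.0, 40.0),
]
labels, centers = trimmed_clustering(data, r=6, g=2)
```

By construction exactly n − r observations receive the label -1. Which observations are discarded can depend on the initial centers, so a heuristic of this kind is usually restarted from several initializations.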
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gallegos, M.T. (2002). Maximum Likelihood Clustering with Outliers. In: Jajuga, K., Sokołowski, A., Bock, HH. (eds) Classification, Clustering, and Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-56181-8_27
DOI: https://doi.org/10.1007/978-3-642-56181-8_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43691-1
Online ISBN: 978-3-642-56181-8
eBook Packages: Springer Book Archive