Size and Power of Multivariate Outlier Detection Rules

  • Conference paper
Algorithms from and for Nature and Life

Abstract

Multivariate outliers are usually identified by means of robust distances. A statistically principled method for accurate outlier detection requires both availability of a good approximation to the finite-sample distribution of the robust distances and correction for the multiplicity implied by repeated testing of all the observations for outlyingness. These principles are not always met by the currently available methods. The goal of this paper is thus to provide data analysts with useful information about the practical behaviour of some popular competing techniques. Our conclusion is that the additional information provided by a data-driven level of trimming is an important bonus which ensures an often considerable gain in power.
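The multiplicity issue raised in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the data are bivariate, the squared distances are referred to the asymptotic chi-square distribution with 2 degrees of freedom (the kind of approximation the abstract warns may be inaccurate in finite samples), the location and scatter estimates are supplied as given rather than computed by a robust estimator, and a Šidák correction stands in for whatever multiplicity adjustment a given method uses. The function names `squared_distances`, `chi2_quantile_2df` and `cutoffs` are hypothetical.

```python
import math
import random


def squared_distances(points, center, scatter):
    """Squared Mahalanobis-type distances of bivariate points.

    `center` and `scatter` are taken as given; in practice they would come
    from a robust estimator (assumption: any high-breakdown fit).
    `scatter` is a symmetric 2x2 matrix ((a, b), (b, c)).
    """
    (a, b), (_, c) = scatter
    det = a * c - b * b
    result = []
    for x, y in points:
        dx, dy = x - center[0], y - center[1]
        # Quadratic form with the explicit inverse of the 2x2 scatter matrix.
        result.append((c * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det)
    return result


def chi2_quantile_2df(prob):
    """Quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - prob)


def cutoffs(n, alpha=0.01):
    """Return (individual, simultaneous) cutoffs for squared distances."""
    individual = chi2_quantile_2df(1.0 - alpha)
    # Sidak correction: each of the n tests runs at level 1 - (1 - alpha)**(1/n).
    simultaneous = chi2_quantile_2df((1.0 - alpha) ** (1.0 / n))
    return individual, simultaneous


# Outlier-free sample: every flagged point is a false detection.
random.seed(0)
n = 200
clean = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(n)]
d2 = squared_distances(clean, (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0)))
individual, simultaneous = cutoffs(n)
print("flagged at individual level:  ", sum(d > individual for d in d2))
print("flagged at simultaneous level:", sum(d > simultaneous for d in d2))
```

Under these assumptions, testing each of the 200 clean observations at the 1% level flags about two of them on average, whereas the simultaneous cutoff targets a 1% probability of declaring any outlier at all in an uncontaminated sample.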

Notes

  1. The authors are grateful to Dr. Spyros Arsenis and Dr. Domenico Perrotta for pointing out this historical reference.

  2. In the rrcov package of the R software this option is called eff.shape.

References

  • Atkinson, A. C., & Riani, M. (2000). Robust diagnostic regression analysis. New York: Springer.

  • Atkinson, A. C., Riani, M., & Cerioli, A. (2004). Exploring multivariate data with the forward search. New York: Springer.

  • Cerioli, A. (2010). Multivariate outlier detection with high-breakdown estimators. Journal of the American Statistical Association, 105, 147–156.

  • Cerioli, A., & Farcomeni, A. (2011). Error rates for multivariate outlier detection. Computational Statistics and Data Analysis, 55, 544–553.

  • Cerioli, A., Riani, M., & Atkinson, A. C. (2009). Controlling the size of multivariate outlier tests with the MCD estimator of scatter. Statistics and Computing, 19, 341–353.

  • Cerioli, A., Atkinson, A. C., & Riani, M. (2011a). Some perspectives on multivariate outlier detection. In S. Ingrassia, R. Rocci, & M. Vichi (Eds.), New perspectives in statistical modeling and data analysis (pp. 231–238). Berlin/Heidelberg: Springer.

  • Cerioli, A., Riani, M., & Torti, F. (2011b). Accurate and powerful multivariate outlier detection. 58th congress of ISI, Dublin.

  • Hadi, A. S., Rahmatullah Imon, A. H. M., & Werner, M. (2009). Detection of outliers. WIREs Computational Statistics, 1, 57–70.

  • Hubert, M., Rousseeuw, P. J., & Van Aelst, S. (2008). High-breakdown robust multivariate methods. Statistical Science, 23, 92–119.

  • Maronna, R. A., Martin, R. D., & Yohai, V. J. (2006). Robust statistics. New York: Wiley.

  • Morgenthaler, S. (2006). A survey of robust statistics. Statistical Methods and Applications, 15, 271–293 (Erratum 16, 171–172).

  • Perrotta, D., Riani, M., & Torti, F. (2009). New robust dynamic plots for regression mixture detection. Advances in Data Analysis and Classification, 3, 263–279.

  • Pison, G., Van Aelst, S., & Willems, G. (2002). Small sample corrections for LTS and MCD. Metrika, 55, 111–123.

  • Riani, M., Atkinson, A. C., & Cerioli, A. (2009). Finding an unknown number of multivariate outliers. Journal of the Royal Statistical Society B, 71, 447–466.

  • Riani, M., Torti, F., & Zani, S. (2011). Outliers and robustness for ordinal data. In R. S. Kenett & S. Salini (Eds.), Modern analysis of customer satisfaction surveys: with applications using R. Chichester: Wiley.

  • Riani, M., Cerioli, A., & Torti, F. (2012). A new look at consistency factors and efficiency of robust scale estimators. Submitted.

  • Rousseeuw, P. J., & Leroy, A. M. (1987). Robust regression and outlier detection. New York: Wiley.

  • Rousseeuw, P. J., & Van Driessen, K. (1999). A fast algorithm for the minimum covariance determinant estimator. Technometrics, 41, 212–223.

  • Salibian-Barrera, M., Van Aelst, S., & Willems, G. (2006). Principal components analysis based on multivariate MM estimators with fast and robust bootstrap. Journal of the American Statistical Association, 101, 1198–1211.

  • Todorov, V., & Filzmoser, P. (2009). An object-oriented framework for robust multivariate analysis. Journal of Statistical Software, 32, 1–47.

  • Wilks, S. S. (1963). Multivariate statistical outliers. Sankhya A, 25, 407–426.

Acknowledgements

The authors gratefully acknowledge the financial support of the project MIUR PRIN MISURA - Multivariate models for risk assessment.

Corresponding author

Correspondence to Andrea Cerioli.

Copyright information

© 2013 Springer International Publishing Switzerland

Cite this paper

Cerioli, A., Riani, M., Torti, F. (2013). Size and Power of Multivariate Outlier Detection Rules. In: Lausen, B., Van den Poel, D., Ultsch, A. (eds) Algorithms from and for Nature and Life. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-00035-0_1
