The Concept of α-Outliers in Structured Data Situations

Robustness and Complex Data Structures

Abstract

Ever since the first data sets were collected and analyzed by specialists and scientists, the question of which observations are “normal” and which are not has been asked. Considerable uncertainty and opacity surround data analyses in which authors claim that certain observations do not fit the rest of the data and have therefore been removed or examined more closely. However, no unique definition of the term “outlier” exists, although numerous proposals have been made. In this chapter we discuss the model-based concept of α-outliers, which is based on the density of the assumed probability distribution.
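
As a minimal sketch of the density-based idea, assume the usual formulation in which the α-outlier region consists of the lowest-density points of the model distribution whose total probability is α. For a univariate normal model N(μ, σ²) this region is the pair of tails beyond μ ± σ·z, where z is the (1 − α/2) quantile of the standard normal distribution. The function names and the restriction to the normal case are illustrative assumptions and are not taken from the chapter.

```python
from statistics import NormalDist


def alpha_outlier_bounds(alpha, mu=0.0, sigma=1.0):
    """Return (lower, upper) such that points outside this interval
    lie in the alpha-outlier region of N(mu, sigma^2).

    Assumption: the region is defined as the set of lowest-density
    points with total probability alpha, which for a normal density
    is the two symmetric tails of probability alpha/2 each.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return mu - sigma * z, mu + sigma * z


def is_alpha_outlier(x, alpha, mu=0.0, sigma=1.0):
    """Check whether x falls in the alpha-outlier region of N(mu, sigma^2)."""
    lower, upper = alpha_outlier_bounds(alpha, mu, sigma)
    return x < lower or x > upper
```

With α = 0.05 and the standard normal model, the bounds are roughly ±1.96, so an observation at 2.5 would be flagged while one at 1.0 would not. The sketch treats μ and σ as known; in practice they would have to be estimated, preferably robustly.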

Acknowledgements

The financial support of the Deutsche Forschungsgemeinschaft (SFB 475, project A1 and SFB 823, project B1) is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Sonja Kuhnt.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Kuhnt, S., Rehage, A. (2013). The Concept of α-Outliers in Structured Data Situations. In: Becker, C., Fried, R., Kuhnt, S. (eds) Robustness and Complex Data Structures. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35494-6_6
