Abstract
Outliers are observations that are markedly discordant with the rest of the data and therefore lie on the periphery of the data region. Many tools have been proposed in the literature for detecting multiple outliers. Most of the recent and attractive methods rely on some measure of the distance of each data point from a center. However, they are fully effective only if the data scatter is symmetric about that center; otherwise, asymmetry makes these measures misleading. For this reason, we propose a method that explores the periphery of the data scatter directly, without reference to any center. The method is a two-step procedure that exploits the sample convex hull and radial projections. It explores gaps in the data scatter and proximities to its boundary, highlighting how sparse the data structure is at its periphery. Finally, a complementary graphical display is offered as a tool for visualizing boundary features.
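The abstract names the sample convex hull and proximity to the boundary as ingredients but does not spell out the procedure. As a rough illustration only, and not the authors' method, the sketch below computes a 2-D convex hull with Andrew's monotone chain and then measures how far each observation lies from the hull boundary. The function names (`convex_hull`, `boundary_distance`) and the toy data are assumptions made for this example; the radial-projection step is not attempted.

```python
import math


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop the last vertex while it does not make a strict left turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    len2 = dx * dx + dy * dy
    if len2 == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Clamp the projection parameter so the foot stays on the segment.
    t = max(0.0, min(1.0, ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / len2))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def boundary_distance(p, hull):
    """Distance from p to the nearest edge of the hull polygon."""
    n = len(hull)
    return min(segment_distance(p, hull[i], hull[(i + 1) % n]) for i in range(n))


# Hypothetical toy data: four corner points plus two interior points.
points = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 2)]
hull = convex_hull(points)
distances = {p: boundary_distance(p, hull) for p in points}
```

Hull vertices have boundary distance zero, so they are the most peripheral observations. No center is estimated at any point, which is the property the abstract emphasizes.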
Copyright information
© 2000 Springer-Verlag Berlin · Heidelberg
About this paper
Cite this paper
Porzio, G.C., Ragozini, G. (2000). Exploring the Periphery of Data Scatters: Are There Outliers?. In: Kiers, H.A.L., Rasson, JP., Groenen, P.J.F., Schader, M. (eds) Data Analysis, Classification, and Related Methods. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-59789-3_38
DOI: https://doi.org/10.1007/978-3-642-59789-3_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67521-1
Online ISBN: 978-3-642-59789-3
eBook Packages: Springer Book Archive