Chapter

Outlier Analysis

pp 149-184

High-Dimensional Outlier Detection: The Subspace Method

  • Charu C. Aggarwal, affiliated with IBM T.J. Watson Research Center

Abstract

Many real data sets are very high dimensional, in some scenarios containing hundreds or thousands of dimensions. As dimensionality increases, many conventional outlier detection methods stop working effectively. This is an artifact of the well-known curse of dimensionality. In high-dimensional space the data becomes sparse, and when the data is analyzed in full dimensionality, the true outliers become masked by the noise effects of multiple irrelevant dimensions.
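
The masking effect described above can be illustrated with a small synthetic experiment. The following is a minimal sketch, not the chapter's method: it assumes standard Gaussian data, one planted outlier that deviates only in dimensions 0 and 1, and a simple distance-from-the-mean score. The names `make_data` and `contrast` are hypothetical helpers introduced only for this illustration.

```python
import math
import random
import statistics


def make_data(n_inliers, n_dims, seed=0):
    """Return n_inliers Gaussian points plus one planted outlier (last row)."""
    rng = random.Random(seed)
    points = [[rng.gauss(0.0, 1.0) for _ in range(n_dims)]
              for _ in range(n_inliers + 1)]
    # Assumption: the outlier deviates only in dimensions 0 and 1;
    # all of its other coordinates are ordinary noise.
    points[-1][0] = 6.0
    points[-1][1] = 6.0
    return points


def contrast(points, index, dims):
    """Distance of points[index] from the sample mean, divided by the
    median such distance over all points, using only the given dims."""
    dims = list(dims)
    n = len(points)
    mean = [sum(p[j] for p in points) / n for j in dims]
    dists = [math.sqrt(sum((p[j] - m) ** 2 for j, m in zip(dims, mean)))
             for p in points]
    return dists[index] / statistics.median(dists)


points = make_data(n_inliers=200, n_dims=500)
outlier = len(points) - 1

# Score the planted outlier in its relevant 2-dimensional subspace,
# then again using all 500 dimensions.
print("subspace contrast:", round(contrast(points, outlier, [0, 1]), 2))
print("full-space contrast:", round(contrast(points, outlier, range(500)), 2))
```

Under these assumptions, the outlier's score in the two relevant dimensions should be several times the typical score, while in the full 500-dimensional space the ratio should fall close to 1, because the 498 irrelevant dimensions contribute most of every point's distance.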