Attribute Outlier Detection over Data Streams
Outlier detection is widely used in many data stream applications, such as network intrusion detection and fraud detection. However, most existing algorithms focus on detecting class outliers, and there is little work on detecting attribute outliers, a task that considers the correlation or relevance among the data items. In this paper we study the problem of detecting attribute outliers within sliding windows over data streams. An efficient algorithm is proposed to perform exact outlier detection. The algorithm relies on an efficient data structure, which stores only the necessary information and can perform the updates incurred by data arrival and expiration with minimum cost. To address the problem of limited memory, we also present an approximate algorithm, which selectively drops data within the current window while maintaining a maximum error bound. Extensive experiments are conducted and the results show that our algorithms are efficient and effective.
Keywords: attribute outlier, data stream
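The abstract does not spell out the outlier definition or the data structure, so the following is only an illustrative sketch of the general idea and not the authors' algorithm. It assumes a count-based sliding window, categorical attributes, and a simple confidence-style test: an attribute value is flagged when it rarely co-occurs with another, well-supported attribute value of the same record. The class name `SlidingWindowCounts` and the parameters `min_support` and `min_conf` are hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowCounts:
    """Count-based sliding window with incrementally maintained counts.

    Illustrative assumption only: the paper's actual structure is not
    described in the abstract.
    """

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()
        self.single = Counter()  # (attr_index, value) -> count in window
        self.pair = Counter()    # ((i, vi), (j, vj)) with i < j -> count

    @staticmethod
    def _items(record):
        return [(i, v) for i, v in enumerate(record)]

    def _update(self, record, delta):
        # Apply +1 on arrival or -1 on expiration; drop zeroed entries
        # so that only the necessary counts are stored.
        items = self._items(record)
        for item in items:
            self.single[item] += delta
            if self.single[item] == 0:
                del self.single[item]
        for key in combinations(items, 2):
            self.pair[key] += delta
            if self.pair[key] == 0:
                del self.pair[key]

    def add(self, record):
        record = tuple(record)
        self.window.append(record)
        self._update(record, +1)
        if len(self.window) > self.window_size:
            self._update(self.window.popleft(), -1)

    def outlier_attributes(self, record, min_support=3, min_conf=0.3):
        """Return indices of attributes whose value is rare given the
        value of another, sufficiently frequent attribute."""
        flagged = set()
        for a, b in combinations(self._items(tuple(record)), 2):
            joint = self.pair.get((a, b), 0)
            # Check both directions: a -> b and b -> a.
            for given, target in ((a, b), (b, a)):
                support = self.single.get(given, 0)
                if support >= min_support and joint / support < min_conf:
                    flagged.add(target[0])
        return flagged


if __name__ == "__main__":
    w = SlidingWindowCounts(window_size=5)
    for _ in range(4):
        w.add(("Paris", "France"))
    w.add(("Paris", "Japan"))
    print(w.outlier_attributes(("Paris", "Japan")))
```

Each arrival or expiration touches only the counts of the affected record, so the cost per update depends on the number of attributes rather than on the window size. An approximate variant in the spirit of the abstract would additionally drop some records from the window, which this sketch does not attempt.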