Angle-Based Outlier Detection Algorithm with More Stable Relationships

  • Xiaojie Li
  • Jian Cheng Lv
  • Dongdong Cheng
Conference paper
Part of the Proceedings in Adaptation, Learning and Optimization book series (PALO, volume 1)

Abstract

Outlier detection is useful in many applications, such as fraud detection and network intrusion detection. The angle-based outlier detection (ABOD) method, proposed by Kriegel et al., plays an important role in identifying outliers in high-dimensional spaces. However, ABOD considers only the relationships between each point and its neighbors, not the relationships among those neighbors, which can cause it to misidentify outliers. In this paper, we provide a small but consistent improvement by replacing the relationships between each point and its neighbors with more stable relationships among the neighbors. Compared with related methods, which work best in either high- or low-dimensional spaces, our method gives significant gains in both. Experimental results on both synthetic and real-world datasets demonstrate the effectiveness of our method.
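
For orientation, the baseline ABOD idea that the abstract builds on can be sketched as follows. This is a minimal illustration of the commonly stated angle-based outlier factor (a distance-weighted variance of scaled dot products between the difference vectors from a point to every pair of other points), not the improved method of this paper, whose details the abstract does not give. The function names `abof` and `rank_outliers` are hypothetical, and the brute-force loop over all pairs is an assumption made for clarity rather than efficiency.

```python
import itertools
import math


def abof(points, index):
    """Angle-based outlier factor of points[index]; a low value suggests an outlier.

    Sketch of the baseline ABOD score, assuming the weighted-variance form:
    value  = <AB, AC> / (|AB|^2 * |AC|^2)
    weight = 1 / (|AB| * |AC|)
    """
    a = points[index]
    others = [p for i, p in enumerate(points) if i != index]
    values, weights = [], []
    for b, c in itertools.combinations(others, 2):
        ab = [bi - ai for ai, bi in zip(a, b)]
        ac = [ci - ai for ai, ci in zip(a, c)]
        norm_ab = math.sqrt(sum(x * x for x in ab))
        norm_ac = math.sqrt(sum(x * x for x in ac))
        if norm_ab == 0.0 or norm_ac == 0.0:
            continue  # skip exact duplicates of the query point
        dot = sum(x * y for x, y in zip(ab, ac))
        values.append(dot / (norm_ab ** 2 * norm_ac ** 2))
        weights.append(1.0 / (norm_ab * norm_ac))
    if not weights:
        raise ValueError("need at least two other points distinct from the query")
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    # Weighted variance of the scaled dot products.
    return sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total


def rank_outliers(points):
    """Return point indices sorted from most to least outlying (ascending ABOF)."""
    scores = [abof(points, i) for i in range(len(points))]
    return sorted(range(len(points)), key=lambda i: scores[i])


# Toy data (an assumption for illustration): a 3x3 grid plus one distant point.
points = [(float(x), float(y)) for x in range(3) for y in range(3)] + [(20.0, 20.0)]
order = rank_outliers(points)
```

On this toy data the distant point (index 9) sees the whole grid within a narrow range of directions, so its score is the smallest and it is ranked first. The score depends only on each point's view of the other points, which is the limitation the abstract says the proposed method addresses by using relationships among the neighbors instead.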

Keywords

Outlier detection, outlier factor, angle-based

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. The Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, P.R. China
