Reduction in Execution Cost of k-Nearest Neighbor Based Outlier Detection Method

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 834)

Abstract

Outlier detection is an important task because it leads to the discovery of critical information in a variety of application domains. Variants of the k-nearest neighbor based outlier detection method have been applied successfully for decades. However, these approaches have a high execution time because they compute a score (known as the outlier score) for every data point. In this paper, we propose a method to reduce the execution time of k-nearest neighbor based algorithms. The proposed method quickly identifies the data points that are normal, so the outlier score for such points need not be computed in further processing. The proposed method is generic and can be applied to improve the execution efficiency of many density-based and distance-based outlier detection methods. The proposed work is compared with existing methods, and the results show that it outperforms them.
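The general idea of skipping the score computation for clearly normal points can be illustrated with a minimal sketch. The abstract does not specify the pruning rule, so everything below is an assumption made for illustration and is not the authors' method: the outlier score is taken to be the distance to the k-th nearest neighbor, and a point is treated as normal when at least k other points lie within a hypothetical `radius`. The function names and the `radius` parameter are likewise invented for this example.

```python
import math

def knn_score(points, i, k):
    """Outlier score of points[i]: distance to its k-th nearest neighbor (brute force)."""
    dists = sorted(math.dist(points[i], q) for j, q in enumerate(points) if j != i)
    return dists[k - 1]

def has_k_close_neighbors(points, i, k, radius):
    """Assumed pruning test: stop scanning once k neighbors within `radius` are found."""
    found = 0
    for j, q in enumerate(points):
        if j != i and math.dist(points[i], q) <= radius:
            found += 1
            if found >= k:
                return True
    return False

def score_with_pruning(points, k, radius):
    """Return {index: score} only for points that were not pruned as normal."""
    scores = {}
    for i in range(len(points)):
        if has_k_close_neighbors(points, i, k, radius):
            continue  # treated as normal; the full score is never computed
        scores[i] = knn_score(points, i, k)
    return scores

# Four tightly clustered points and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = score_with_pruning(points, k=2, radius=0.5)
```

In this toy data set only the isolated point at index 4 survives pruning and receives a score. The early exit in the neighbor scan is where the saving would come from in this sketch; how much time is actually saved depends on the data and on the pruning rule used, which the abstract leaves unspecified.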

Keywords

Density-based outlier detection method; k-nearest neighbor; LOF; Execution time

Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. National Institute of Technology Rourkela, Rourkela, India
