Local Outlier Detection Algorithm Based on Gaussian Kernel Density Function

  • Conference paper
Computational Intelligence and Intelligent Systems (ISICA 2018)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 986))

Abstract

With the rapid development of information technology, the structure of data resources is becoming increasingly complex, and outlier mining is attracting growing attention. This paper proposes a local outlier detection algorithm based on the Gaussian kernel function that considers three kinds of neighbors: k nearest neighbors, reverse k nearest neighbors, and shared nearest neighbors. First, the algorithm stores the neighbors of each data object through kNN maps, including its k nearest neighbors, reverse k nearest neighbors, and shared nearest neighbors, which together form a kernel neighbor set S. Second, it estimates the density of each data object using kernel density estimation (KDE). Finally, the relative density outlier factor (RDOF) is used to estimate how far each data object deviates from its neighborhood, which determines whether the object is an outlier. Experiments on real and synthetic data sets demonstrate the validity of the algorithm.
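The three steps in the abstract can be illustrated with a minimal sketch. This page does not give the paper's formulas, so everything below is an assumption made for illustration: shared nearest neighbors are taken to be objects whose kNN lists overlap, the density is an average of Gaussian kernels with a fixed bandwidth `h` over S plus the object itself, and the score is the mean neighbor density divided by the object's own density. The function names are hypothetical and the score is only a stand-in for the paper's RDOF.

```python
import numpy as np

def knn_indices(X, k):
    """Row i holds the indices of the k nearest neighbors of point i."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def neighbor_sets(knn):
    """Build S_i = kNN(i) | reverse kNN(i) | shared nearest neighbors(i)."""
    n = knn.shape[0]
    knn_sets = [set(row.tolist()) for row in knn]
    rnn_sets = [set() for _ in range(n)]
    for i, nbrs in enumerate(knn_sets):
        for j in nbrs:
            rnn_sets[j].add(i)  # i counts j among its kNN, so i is a reverse neighbor of j
    S = []
    for i in range(n):
        # Assumption: j is a shared nearest neighbor of i if their kNN lists overlap.
        snn = {j for j in range(n) if j != i and knn_sets[i] & knn_sets[j]}
        S.append(knn_sets[i] | rnn_sets[i] | snn)
    return S

def kernel_density(X, S, h):
    """Gaussian KDE at each point, averaged over S_i plus the point itself (assumption)."""
    n, d = X.shape
    norm = (2.0 * np.pi) ** (-d / 2.0) * h ** (-d)
    dens = np.empty(n)
    for i in range(n):
        idx = sorted(S[i] | {i})
        sq_dist = np.sum((X[idx] - X[i]) ** 2, axis=1)
        dens[i] = norm * np.mean(np.exp(-sq_dist / (2.0 * h ** 2)))
    return dens

def relative_density_score(X, k=5, h=1.0):
    """Mean density of the neighbors in S_i divided by the density of point i."""
    S = neighbor_sets(knn_indices(X, k))
    dens = kernel_density(X, S, h)
    return np.array([dens[sorted(S[i])].mean() / dens[i] for i in range(len(X))])

# Toy data: a 5 x 5 grid of close points plus one isolated point (index 25).
X = np.array([[0.2 * a, 0.2 * b] for a in range(5) for b in range(5)] + [[5.0, 5.0]])
scores = relative_density_score(X, k=5, h=1.0)
print(int(np.argmax(scores)), scores.round(2))
```

Under these assumptions, a score near 1 means an object is about as dense as its neighbors, while a score well above 1 marks an object that sits in a much sparser region than the objects in S. The pairwise distance matrix keeps the sketch short but costs O(n²) memory, so it is only suitable for small data sets.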

Author information

Corresponding author

Correspondence to Zhongping Zhang.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhang, Z., Liu, J., Miao, C. (2019). Local Outlier Detection Algorithm Based on Gaussian Kernel Density Function. In: Peng, H., Deng, C., Wu, Z., Liu, Y. (eds) Computational Intelligence and Intelligent Systems. ISICA 2018. Communications in Computer and Information Science, vol 986. Springer, Singapore. https://doi.org/10.1007/978-981-13-6473-0_29

  • DOI: https://doi.org/10.1007/978-981-13-6473-0_29

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-6472-3

  • Online ISBN: 978-981-13-6473-0

  • eBook Packages: Computer Science, Computer Science (R0)
