Local Outlier Detection Algorithm Based on Gaussian Kernel Density Function

Zhang, Zhongping; Liu, Jiaojiao; Miao, Chuangye

doi:10.1007/978-981-13-6473-0_29


Part of the book series: Communications in Computer and Information Science (CCIS, volume 986)

Included in the following conference series:

International Symposium on Intelligence Computation and Applications

Abstract

With the rapid development of information technology, the structure of data resources is becoming increasingly complex, and outlier mining is attracting growing attention. This paper proposes a local outlier detection algorithm based on the Gaussian kernel function that considers three kinds of neighbors: k-nearest neighbors, reverse k-nearest neighbors, and shared nearest neighbors. First, the algorithm stores the neighbors of each data object in kNN maps, including its k-nearest neighbors, reverse k-nearest neighbors, and shared nearest neighbors, which together form a kernel neighbor set S. Second, it estimates the density of each data object using kernel density estimation (KDE). Finally, the relative density outlier factor (RDOF) is used to estimate the degree to which a data object deviates from its neighborhood and thereby to determine whether the object is an outlier. The validity of the algorithm is demonstrated on real and synthetic data sets.
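The three steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so every definition below is an assumption made for illustration, not the authors' method: shared nearest neighbors are taken to be objects whose k-nearest-neighbor lists overlap, S is the union of the three neighbor sets, the density is a Gaussian KDE over S with a hypothetical fixed bandwidth `h` (the object itself is included so the density is never zero), and RDOF is the mean density of the objects in S divided by the object's own density. All function names are hypothetical.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of each point's k nearest neighbors (self excluded), brute force."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)           # a point is not its own neighbor
    return np.argsort(d, axis=1)[:, :k]

def kernel_neighbor_sets(knn):
    """Assumed S(p): union of kNN, reverse kNN, and shared nearest neighbors."""
    n = knn.shape[0]
    knn_sets = [set(row.tolist()) for row in knn]
    rknn = [set() for _ in range(n)]
    for p in range(n):
        for q in knn_sets[p]:
            rknn[q].add(p)                # q is in kNN(p), so p is in RkNN(q)
    S = []
    for p in range(n):
        # Assumption: q is a shared nearest neighbor if kNN(p) and kNN(q) overlap.
        snn = {q for q in range(n) if q != p and knn_sets[p] & knn_sets[q]}
        S.append(knn_sets[p] | rknn[p] | snn)
    return S

def gaussian_kde_density(X, S, h):
    """Gaussian kernel density of each point, estimated over S(p) plus p itself."""
    dim = X.shape[1]
    norm = (2.0 * np.pi) ** (dim / 2.0) * h ** dim
    dens = np.empty(len(X))
    for p, nbrs in enumerate(S):
        idx = np.fromiter(nbrs, dtype=int)
        sq = np.sum((X[idx] - X[p]) ** 2, axis=1)
        # The leading 1.0 is the kernel value of p at itself (distance zero).
        dens[p] = (1.0 + np.exp(-sq / (2.0 * h * h)).sum()) / ((len(idx) + 1) * norm)
    return dens

def rdof_scores(X, k=3, h=1.0):
    """Assumed RDOF: mean density over S(p) divided by the density of p."""
    S = kernel_neighbor_sets(knn_indices(X, k))
    dens = gaussian_kde_density(X, S, h)
    return np.array([dens[list(nbrs)].mean() / dens[p] for p, nbrs in enumerate(S)])

# Usage: a 5 x 5 unit grid plus one distant point (index 25).
grid = np.array([[i, j] for i in range(5) for j in range(5)], dtype=float)
X = np.vstack([grid, [[20.0, 20.0]]])
scores = rdof_scores(X, k=3, h=1.0)
print(int(np.argmax(scores)))             # index of the most outlying point
```

Under these assumptions a score near 1 means an object is about as dense as its neighborhood, and larger scores indicate stronger outliers. The pairwise distance matrix makes this sketch quadratic in memory, so it suits only small data sets.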



Author information

Authors and Affiliations

School of Information Science and Engineering, Yanshan University, Qinhuangdao, 066004, Hebei, China
Zhongping Zhang, Jiaojiao Liu & Chuangye Miao
The Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Qinhuangdao, 066004, Hebei, China
Zhongping Zhang


Corresponding author

Correspondence to Zhongping Zhang.

Editor information

Editors and Affiliations

School of Information Science and Technology, Jiujiang University, Jiujiang, China
Hu Peng
School of Information Science and Technology, Jiujiang University, Jiujiang, China
Changshou Deng
School of Computer, Wuhan University, Wuhan, China
Zhijian Wu
School of Computer Science and Engineering, The University of Aizu, Aizu-Wakamatsu, Fukushima, Japan
Yong Liu


About this paper

Cite this paper

Zhang, Z., Liu, J., Miao, C. (2019). Local Outlier Detection Algorithm Based on Gaussian Kernel Density Function. In: Peng, H., Deng, C., Wu, Z., Liu, Y. (eds) Computational Intelligence and Intelligent Systems. ISICA 2018. Communications in Computer and Information Science, vol 986. Springer, Singapore. https://doi.org/10.1007/978-981-13-6473-0_29


DOI: https://doi.org/10.1007/978-981-13-6473-0_29
Published: 08 February 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6472-3
Online ISBN: 978-981-13-6473-0
eBook Packages: Computer Science, Computer Science (R0)
