An Improved Outlier Detection Algorithm Based on Reverse K-Nearest Neighbors of Adaptive Parameters

Fangfang, Xie; Liancheng, Xu; Xuezhi, Chi; Zhenfang, Zhu

doi:10.1007/978-94-007-7618-0_47

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 269)

Abstract

Outlier detection based on reverse k-nearest neighbors can identify isolated points. However, finding the k-nearest neighbors has a time complexity of O(kN²), which makes the method unsuitable for large data sets, and the choice of the parameter k strongly affects which outliers are found in such data sets. This paper uses an adaptive method to determine the parameter k and proposes an efficient pruning method based on the triangle inequality, which reduces the computation needed to detect outliers. Theoretical analysis and experimental results demonstrate the feasibility and efficiency of the algorithm.
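
The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not describe the adaptive selection of k, so k is fixed here, and the single-pivot pruning rule, the pivot choice, and all function names (`knn_brute`, `knn_pruned`, `reverse_knn_counts`) are assumptions made for illustration. The sketch counts how many points list a given point among their k nearest neighbors (a low count suggests an isolated point) and uses the triangle inequality bound |d(q, p) − d(x, p)| ≤ d(q, x) to skip distance computations that cannot change the result.

```python
import heapq
import math


def knn_brute(points, q, k):
    """Indices of the k nearest neighbors of points[q], by exhaustive search."""
    dists = sorted((math.dist(points[q], x), j)
                   for j, x in enumerate(points) if j != q)
    return sorted(j for _, j in dists[:k])


def knn_pruned(points, q, k, pivot_dist):
    """Same result as knn_brute, but skips candidates ruled out by the
    triangle inequality. pivot_dist[j] is the precomputed distance from
    points[j] to one fixed pivot (an assumed, illustrative choice).
    Returns (neighbor indices, number of distances actually computed)."""
    best = []      # max-heap via negation: entries are (-distance, -index)
    computed = 0
    for j, x in enumerate(points):
        if j == q:
            continue
        if len(best) == k:
            kth_best = -best[0][0]
            # Lower bound on d(q, x); if it already exceeds the current
            # k-th best distance, x cannot be among the k nearest.
            if abs(pivot_dist[q] - pivot_dist[j]) > kth_best:
                continue
        d = math.dist(points[q], x)
        computed += 1
        if len(best) < k:
            heapq.heappush(best, (-d, -j))
        elif (-d, -j) > best[0]:
            heapq.heapreplace(best, (-d, -j))
    return sorted(-neg_j for _, neg_j in best), computed


def reverse_knn_counts(points, k):
    """counts[j] = number of points that have j among their k nearest
    neighbors. The pivot is arbitrarily taken to be points[0]."""
    pivot_dist = [math.dist(points[0], x) for x in points]
    counts = [0] * len(points)
    for q in range(len(points)):
        neighbors, _ = knn_pruned(points, q, k, pivot_dist)
        for j in neighbors:
            counts[j] += 1
    return counts


if __name__ == "__main__":
    # Four clustered points and one isolated point (toy data).
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    counts = reverse_knn_counts(data, k=2)
    print(counts)  # the isolated point (index 4) has no reverse neighbors
```

In this toy data set no clustered point has the isolated point among its two nearest neighbors, so its reverse-neighbor count is zero. How much a pivot-based bound saves depends heavily on the data and on the pivot; the chapter's own pruning method may differ.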

Acknowledgments

This work is partly supported by the National Natural Science Foundation of China (60873247) and the Science and Technology Plan in Colleges and Universities of Shandong Province (J12LN21).

Author information

Authors and Affiliations

School of Information Science & Engineering, Shandong Normal University, Jinan, 250014, China
Xie Fangfang & Xu Liancheng
Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan, 250014, China
Xie Fangfang & Xu Liancheng
Shandong Police College, Jinan, 250014, China
Chi Xuezhi
School of Information Science and Electric Engineering, Shandong Jiaotong University, Jinan, 250357, China
Zhu Zhenfang

Authors

Xie Fangfang
Xu Liancheng
Chi Xuezhi
Zhu Zhenfang

Corresponding author

Correspondence to Xu Liancheng.

Editor information

Editors and Affiliations

Cognitive Science, Xiamen University, Xiamen, People's Republic of China
Shaozi Li
Human Informatics and Cognitive Sciences, Waseda University Networked Information Systems Lab, Waseda, Japan
Qun Jin
School of Systems Information Science, Future University Hakodate, Hakodate, Hokkaido, Japan
Xiaohong Jiang
Department of Computer Science and Engineering, Seoul University of Science and Technology (SeoulTech), Seoul, Republic of Korea
James J. (Jong Hyuk) Park

About this paper

Cite this paper

Fangfang, X., Liancheng, X., Xuezhi, C., Zhenfang, Z. (2014). An Improved Outlier Detection Algorithm Based on Reverse K-Nearest Neighbors of Adaptive Parameters. In: Li, S., Jin, Q., Jiang, X., Park, J. (eds) Frontier and Future Development of Information Technology in Medicine and Education. Lecture Notes in Electrical Engineering, vol 269. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-7618-0_47

DOI: https://doi.org/10.1007/978-94-007-7618-0_47
Published: 06 December 2013
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-7617-3
Online ISBN: 978-94-007-7618-0
eBook Packages: Engineering, Engineering (R0)
