Abstract
The outlier detection algorithm based on reverse k-nearest neighbors can detect isolated points. However, finding the k-nearest neighbors has a time complexity of O(kN²), which is unsuitable for large data sets, and the choice of the parameter k strongly affects which outliers are found in large data sets. This paper uses an adaptive method to determine the parameter k and proposes an efficient pruning method based on the triangle inequality, which reduces the computation required to detect outliers. Theoretical analysis and experimental results demonstrate the feasibility and efficiency of the algorithm.
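The general idea can be illustrated with a minimal sketch, which is not the paper's algorithm. It assumes Euclidean distance, a fixed k (the adaptive choice of k is not reproduced), and a single pivot point for the triangle-inequality bound |d(q, pivot) - d(x, pivot)| <= d(q, x). The names `knn_with_pruning`, `reverse_knn_counts`, and `pivot_index` are hypothetical. A point that appears in few or no other points' k-nearest-neighbor lists is a candidate outlier.

```python
import heapq
import math


def knn_with_pruning(points, k, pivot_index=0):
    """Return (neighbors, calls): neighbors[i] is the set of indices of the
    k nearest neighbors of point i, calls counts exact distance evaluations.

    Assumption: one fixed pivot; the paper's own pruning rule may differ.
    """
    n = len(points)
    pivot = points[pivot_index]
    # Precompute each point's distance to the pivot (N evaluations).
    to_pivot = [math.dist(p, pivot) for p in points]
    calls = 0
    neighbors = []
    for i in range(n):
        heap = []  # max-heap of the best k so far, stored as (-distance, index)
        for j in range(n):
            if j == i:
                continue
            if len(heap) == k:
                # Triangle inequality: d(i, j) >= |d(i, pivot) - d(j, pivot)|.
                lower_bound = abs(to_pivot[i] - to_pivot[j])
                if lower_bound >= -heap[0][0]:
                    continue  # j cannot beat the current k-th neighbor
            d = math.dist(points[i], points[j])
            calls += 1
            if len(heap) < k:
                heapq.heappush(heap, (-d, j))
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, j))
        neighbors.append({j for _, j in heap})
    return neighbors, calls


def reverse_knn_counts(neighbors):
    """counts[j] = number of points that list j among their k nearest neighbors."""
    counts = [0] * len(neighbors)
    for nbrs in neighbors:
        for j in nbrs:
            counts[j] += 1
    return counts


# Toy data: a small cluster plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (10.0, 10.0)]
neighbors, calls = knn_with_pruning(points, k=2)
counts = reverse_knn_counts(neighbors)
print("reverse k-NN counts:", counts)
print("exact distance evaluations:", calls, "of", len(points) * (len(points) - 1))
```

In this toy data the distant point receives no reverse neighbors, which is the signal a reverse k-NN detector looks for. How much the bound prunes depends heavily on the pivot and the data, so the saving shown here should not be read as representative.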
Acknowledgments
This work is supported in part by the National Natural Science Foundation of China (60873247) and the Science and Technology Plan in Colleges and Universities of Shandong Province (J12LN21).
Copyright information
© 2014 Springer Science+Business Media Dordrecht
About this paper
Cite this paper
Fangfang, X., Liancheng, X., Xuezhi, C., Zhenfang, Z. (2014). An Improved Outlier Detection Algorithm Based on Reverse K-Nearest Neighbors of Adaptive Parameters. In: Li, S., Jin, Q., Jiang, X., Park, J. (eds) Frontier and Future Development of Information Technology in Medicine and Education. Lecture Notes in Electrical Engineering, vol 269. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-7618-0_47
DOI: https://doi.org/10.1007/978-94-007-7618-0_47
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-7617-3
Online ISBN: 978-94-007-7618-0
eBook Packages: Engineering, Engineering (R0)