Nearest Neighbor Method Based on Local Distribution for Classification

Mao, Chengsheng; Hu, Bin; Moore, Philip; Su, Yun; Wang, Manman

doi:10.1007/978-3-319-18038-0_19

Conference paper
First Online: 01 January 2015

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9077)

Abstract

The k-nearest-neighbor (kNN) algorithm is a simple but effective classification method that predicts the class label of a query sample from the information contained in its neighborhood. Previous versions of kNN usually treat the k nearest neighbors separately, using only their counts or their individual distances. However, counts and isolated distances may be insufficient for an effective classification decision. This paper investigates the kNN method from the perspective of local distribution and, on that basis, proposes an improved implementation of kNN. The proposed method assigns the query sample to the class with the maximum posterior probability, which is estimated from the local distribution using Bayes' rule. Experiments were conducted on 15 benchmark datasets, and the reported results indicate strong performance and robustness for the proposed method compared with other state-of-the-art classifiers.
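
The abstract does not specify how the local distribution is estimated, so the following is only a rough sketch of the general idea, not the authors' estimator. It contrasts a plain majority-vote kNN with a rule that picks the class maximizing prior times likelihood inside the neighborhood. The isotropic Gaussian likelihood, the neighbor-fraction prior, the variance floor `var_floor`, and both function names are assumptions introduced for illustration.

```python
import numpy as np

def knn_vote(X, y, q, k):
    """Baseline kNN: majority vote among the k nearest training points."""
    dist = np.linalg.norm(X - q, axis=1)
    idx = np.argsort(dist)[:k]
    labels, counts = np.unique(y[idx], return_counts=True)
    return labels[np.argmax(counts)]

def local_posterior_classify(X, y, q, k, var_floor=1e-3):
    """Hypothetical local-distribution rule (assumed, not taken from the paper).

    Inside the k-neighborhood of q, for each class c present:
      prior      P(c)     = fraction of the k neighbors labeled c
      likelihood p(q | c) = isotropic Gaussian fitted to the class-c neighbors
    Returns the class with the largest log P(c) + log p(q | c).
    """
    dist = np.linalg.norm(X - q, axis=1)
    idx = np.argsort(dist)[:k]
    Xn, yn = X[idx], y[idx]
    dim = X.shape[1]

    best_label, best_score = None, -np.inf
    for c in np.unique(yn):
        Xc = Xn[yn == c]
        prior = len(Xc) / k
        mu = Xc.mean(axis=0)
        # One shared variance for all dimensions; the floor avoids a zero
        # variance when a class has a single neighbor.
        var = Xc.var(axis=0).mean() + var_floor
        sq_dist = np.sum((q - mu) ** 2)
        log_lik = -0.5 * dim * np.log(2 * np.pi * var) - 0.5 * sq_dist / var
        score = np.log(prior) + log_lik
        if score > best_score:
            best_label, best_score = c, score
    return best_label

# Toy data: two well-separated square clusters.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [3, 3], [3, 4], [4, 3], [4, 4]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(knn_vote(X, y, np.array([0.5, 0.5]), k=5))
print(local_posterior_classify(X, y, np.array([0.5, 0.5]), k=5))
```

On well-separated data like this toy set, both rules agree. The second rule differs from the vote in that it also uses where the neighbors of each class sit relative to the query, rather than only how many there are.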

Author information

Authors and Affiliations

School of Information Science and Engineering, Lanzhou University, Lanzhou, China
Chengsheng Mao, Bin Hu, Philip Moore, Yun Su & Manman Wang

Corresponding author

Correspondence to Bin Hu.

Editor information

Editors and Affiliations

Ho Chi Minh City University of Technology, Ho Chi Minh City, Vietnam
Tru Cao
Singapore Management University, Singapore, Singapore
Ee-Peng Lim
Nanjing University, Nanjing, China
Zhi-Hua Zhou
Japan Advanced Institute of Science and Technology, Nomi City, Japan
Tu-Bao Ho
University of Hong Kong, Hong Kong, Hong Kong SAR
David Cheung
Osaka University, Osaka, Japan
Hiroshi Motoda

About this paper

Cite this paper

Mao, C., Hu, B., Moore, P., Su, Y., Wang, M. (2015). Nearest Neighbor Method Based on Local Distribution for Classification. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science, vol 9077. Springer, Cham. https://doi.org/10.1007/978-3-319-18038-0_19

DOI: https://doi.org/10.1007/978-3-319-18038-0_19
Published: 17 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18037-3
Online ISBN: 978-3-319-18038-0
eBook Packages: Computer Science, Computer Science (R0)