
Enhanced KNNC Using Train Sample Clustering

  • Conference paper
Engineering Applications of Neural Networks (EANN 2015)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 517)


Abstract

In this paper, a new classification method based on the k-Nearest Neighbor (kNN) lazy classifier is proposed. The method leverages clustering to reduce the size of the training set used by the kNN classifier and thereby to lower its time complexity. The new approach is called the Modified Nearest Neighbor Classifier Based on Clustering (MNNCBC). Inspired by the traditional lazy kNN algorithm, the main idea is to classify a test instance based on the labels of its k nearest neighbors. In MNNCBC, the training set is first grouped into a small number of partitions. By obtaining several partitions through repeated runs of a simple clustering algorithm, MNNCBC extracts a large number of clusters from those partitions. A class label is then assigned to the center of each cluster, determined by a majority vote over the class labels of the patterns in that cluster. MNNCBC then iteratively inserts clusters into a pool of selected clusters, which serves as the training set of the final 1-NN classifier, as long as each insertion improves the accuracy of the 1-NN classifier over a set of patterns comprising the training set and the validation set. The resulting set of the most accurate clusters is used as the training set of the proposed 1-NN classifier, and the class label of a new test sample is determined by the class label of the nearest cluster center. While the kNN lazy classifier is computationally expensive, MNNCBC reduces its computational cost by a factor of about 1/k, making MNNCBC roughly k times faster than the kNN classifier. MNNCBC is evaluated on several real datasets from the UCI repository. Empirical results show that MNNCBC clearly improves on the kNN classifier in terms of both accuracy and time complexity.
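
The abstract describes a four-step pipeline: repeated clustering of the training set, majority-vote labeling of cluster centers, greedy validation-driven selection of clusters, and nearest-center prediction. A minimal sketch of that pipeline follows, assuming k-means as the "simple clustering algorithm" and scikit-learn for the base learners; the function names (mnncbc_fit, mnncbc_predict), the greedy insertion order, and the parameter defaults are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np
from collections import Counter
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier


def mnncbc_fit(X_train, y_train, X_val, y_val,
               n_runs=5, n_clusters=10, seed=0):
    """Build the pool of labeled cluster centers used by the final 1-NN."""
    X_train, y_train = np.asarray(X_train), np.asarray(y_train)
    X_val, y_val = np.asarray(X_val), np.asarray(y_val)
    rng = np.random.RandomState(seed)

    # Step 1: several runs of a simple clustering algorithm produce a
    # large pool of candidate clusters.
    candidates = []
    for _ in range(n_runs):
        km = KMeans(n_clusters=n_clusters, n_init=1,
                    random_state=rng.randint(1 << 30)).fit(X_train)
        for c in range(n_clusters):
            members = y_train[km.labels_ == c]
            if members.size == 0:
                continue
            # Step 2: label each cluster center by a majority vote over
            # the class labels of its member patterns.
            label = Counter(members.tolist()).most_common(1)[0][0]
            candidates.append((km.cluster_centers_[c], label))

    # Step 3: greedily keep a candidate cluster only if adding its
    # center improves 1-NN accuracy on training + validation patterns.
    X_eval = np.vstack([X_train, X_val])
    y_eval = np.concatenate([y_train, y_val])
    centers = [candidates[0][0]]
    labels = [candidates[0][1]]

    def pool_accuracy():
        knn = KNeighborsClassifier(n_neighbors=1)
        knn.fit(np.asarray(centers), np.asarray(labels))
        return knn.score(X_eval, y_eval)

    best = pool_accuracy()
    for center, label in candidates[1:]:
        centers.append(center)
        labels.append(label)
        score = pool_accuracy()
        if score > best:
            best = score          # keep the cluster
        else:
            centers.pop()         # revert: no improvement
            labels.pop()
    return np.asarray(centers), np.asarray(labels)


def mnncbc_predict(centers, labels, X_test):
    # Step 4: a test sample takes the label of its nearest cluster center.
    knn = KNeighborsClassifier(n_neighbors=1).fit(centers, labels)
    return knn.predict(X_test)
```

Because prediction consults only the selected cluster centers rather than every training pattern, each query costs on the order of the number of centers in distance computations instead of the training-set size, which is the source of the speed-up claimed above.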



Author information


Correspondence to Hamid Parvin.



Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Parvin, H., Zolfaghari, A., Rad, F. (2015). Enhanced KNNC Using Train Sample Clustering. In: Iliadis, L., Jayne, C. (eds) Engineering Applications of Neural Networks. EANN 2015. Communications in Computer and Information Science, vol 517. Springer, Cham. https://doi.org/10.1007/978-3-319-23983-5_16


  • DOI: https://doi.org/10.1007/978-3-319-23983-5_16


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23981-1

  • Online ISBN: 978-3-319-23983-5

  • eBook Packages: Computer Science, Computer Science (R0)
