Finding Small Consistent Subset for the Nearest Neighbor Classifier Based on Support Graphs

  • Milton García-Borroto
  • Yenny Villuendas-Rey
  • Jesús Ariel Carrasco-Ochoa
  • José Fco. Martínez-Trinidad
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5856)


Finding a minimal subset of objects that correctly classifies the training set for the nearest neighbor classifier has been an active research area in the Pattern Recognition and Machine Learning communities for decades. Although finding the Minimal Consistent Subset is not feasible in many real applications, several authors have proposed methods to find small consistent subsets. In this paper, we introduce a novel algorithm for this task, based on support graphs. Experiments over a wide range of repository databases show that our algorithm finds consistent subsets with lower cardinality than traditional methods.
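
To make the notion of a consistent subset concrete, the following is a minimal sketch in Python. It is not the support-graph algorithm introduced in the paper, which the abstract does not detail. It shows a consistency check and Hart's classic condensed nearest neighbor rule, one of the traditional methods in this area. The function names, the Euclidean distance, and the toy data are assumptions made for illustration only.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest_label(x, prototypes):
    """Return the label of the prototype closest to x (1-NN)."""
    _, label = min(prototypes, key=lambda p: dist(x, p[0]))
    return label


def is_consistent(subset, training):
    """True if 1-NN over `subset` classifies every training object correctly."""
    return all(nearest_label(x, subset) == y for x, y in training)


def condensed_nn(training):
    """Hart's condensed nearest neighbor rule (illustrative sketch).

    Starts from the first object and keeps adding every object that the
    current subset misclassifies, until a full pass adds nothing.
    Assumes no two identical points carry different labels.
    """
    subset = [training[0]]
    changed = True
    while changed:
        changed = False
        for x, y in training:
            if (x, y) not in subset and nearest_label(x, subset) != y:
                subset.append((x, y))
                changed = True
    return subset


# Hypothetical toy data: two well-separated classes in the plane.
training = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.0), "a"),
    ((1.0, 1.0), "b"),
    ((1.1, 1.0), "b"),
]
subset = condensed_nn(training)
```

On this toy data the rule keeps one object per class. The result is consistent but, in general, not guaranteed to be minimal, which is the gap that methods for finding small consistent subsets aim to narrow.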


nearest neighbor, condensing, prototype selection, minimal consistent subset



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Milton García-Borroto (1, 3)
  • Yenny Villuendas-Rey (2)
  • Jesús Ariel Carrasco-Ochoa (3)
  • José Fco. Martínez-Trinidad (3)
  1. Bioplantas Center, UNICA, C. de Ávila, Cuba
  2. Ciego de Ávila University (UNICA), C. de Ávila, Cuba
  3. National Institute of Astrophysics, Optics and Electronics, Puebla, México
