
Optimal selection of reference set for the nearest neighbor classification by Tabu search

  • Correspondence
  • Published in: Journal of Computer Science and Technology

Abstract

In this paper, a new approach is presented to find the reference set for the nearest neighbor classifier. The optimal reference set, which has the minimum sample size while satisfying a given error rate threshold, is obtained through a Tabu search algorithm. When the error rate threshold is set to zero, the algorithm obtains a near-minimal consistent subset of a given training set. When the threshold is set to a small appropriate nonzero value, the obtained reference set may compensate for the bias of the nearest neighbor estimate. An aspiration criterion for Tabu search is introduced, which prevents the search process from wandering inefficiently between the feasible and infeasible regions of the search space and speeds up convergence. Experimental results on a number of typical data sets are presented and analyzed to illustrate the benefits of the proposed method. Compared with conventional methods, such as CNN and Dasarathy's algorithm, the size of the reduced reference sets is much smaller, and the nearest neighbor classification performance is better, especially when the error rate threshold is set to an appropriate nonzero value. The experimental results also show that the MCS (minimal consistent set) of Dasarathy's algorithm is not minimal, and that its candidate consistent set is not guaranteed to decrease monotonically. A counterexample is given to confirm this claim.
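The paper's actual formulation is not reproduced on this page, so the following is only a minimal sketch of the idea the abstract describes: Tabu search over subsets of the training set, minimizing subset size subject to an error-rate threshold, with an aspiration criterion that lets a tabu move through when it yields a feasible subset smaller than the best found so far. All names (`tabu_select`, `nn_error`), the move structure (flip one sample's membership), and the parameter defaults are illustrative assumptions, not the authors' implementation.

```python
def nn_error(ref_idx, X, y):
    """Error rate of the full training set (X, y) when each sample is
    classified by its nearest neighbor in the reference subset ref_idx."""
    errors = 0
    for xi, yi in zip(X, y):
        nearest = min(ref_idx, key=lambda j: sum((a - b) ** 2 for a, b in zip(xi, X[j])))
        if y[nearest] != yi:
            errors += 1
    return errors / len(X)

def tabu_select(X, y, threshold=0.0, iters=100, tenure=5):
    """Tabu search over reference subsets.  Each move flips one sample's
    membership; the objective prefers feasible subsets (error <= threshold),
    then smaller ones.  Aspiration: a tabu move is allowed anyway when it
    produces a feasible subset smaller than the best found so far."""
    n = len(X)
    current = set(range(n))          # start from the full training set
    best = set(current)
    tabu = {}                        # sample index -> iteration when its tabu status expires
    for t in range(iters):
        candidates = []
        for j in range(n):
            trial = set(current)
            trial.symmetric_difference_update({j})   # flip membership of sample j
            if not trial:
                continue
            err = nn_error(trial, X, y)
            feasible = err <= threshold
            score = (not feasible, len(trial), err)  # feasible first, then small, then accurate
            aspiration = feasible and len(trial) < len(best)
            if tabu.get(j, -1) < t or aspiration:    # aspiration overrides tabu status
                candidates.append((score, j, trial))
        if not candidates:
            break
        _, j, current = min(candidates, key=lambda c: (c[0], c[1]))
        tabu[j] = t + tenure
        if nn_error(current, X, y) <= threshold and len(current) < len(best):
            best = set(current)
    return sorted(best)

# Toy usage: two well-separated classes; with threshold 0 the search finds
# a consistent subset much smaller than the full training set.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = [0, 0, 0, 1, 1, 1]
ref = tabu_select(X, y, threshold=0.0)
```

The aspiration test mirrors the role described in the abstract: without it, a move back into the feasible region can stay blocked by its tabu status, and the search wanders between feasible and infeasible subsets instead of converging.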


References

  1. Dasarathy B V. Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques. Los Alamitos, CA: IEEE Computer Society Press, 1991.

  2. Hart P E. The condensed nearest neighbor rule. IEEE Trans. Information Theory, May 1968, IT-14(3): 515–516.

  3. Gates G W. The reduced nearest neighbor rule. IEEE Trans. Information Theory, May 1972, IT-18(3): 431–433.

  4. Swonger C W. Sample set condensation for a condensed nearest neighbor decision rule for pattern recognition. In Frontiers of Pattern Recognition, Watanabe S (ed.), New York: Academic Press, 1972, pp.511–519.

  5. Chang C L. Finding prototypes for nearest neighbor classifiers. IEEE Trans. Computers, Nov. 1974, C-23(11): 1179–1184.

  6. Devijver P A, Kittler J. On the edited nearest neighbor rule. In Proc. 5th Int. Conf. Pattern Recognition, Miami, Florida, 1980, pp.72–80.

  7. Dasarathy B V. Minimal consistent set (MCS) identification for optimal nearest neighbor decision systems design. IEEE Trans. Syst. Man Cybern., March 1994, 24(3): 511–517.

  8. Kuncheva L I. Fitness functions in editing k-NN reference set by genetic algorithms. Pattern Recognition, 1997, 30(6): 1041–1049.

  9. Glover F, Laguna M. Tabu Search. In Modern Heuristic Techniques for Combinatorial Problems, Reeves C R (ed.), Berkshire: McGraw-Hill, 1995, pp.70–150.

  10. Fukunaga K. Introduction to Statistical Pattern Recognition. Second Edition, New York: Academic Press, 1990.

  11. Hamamoto Y, Uchimura S, Tomita S. A bootstrap technique for nearest neighbor classifier design. IEEE Trans. Pattern Analysis and Machine Intelligence, Jan. 1997, 19(1): 73–79.

  12. Van Ness J. On the dominance of non-parametric Bayes rule discriminant algorithms in high dimensions. Pattern Recognition, 1980, 12(3): 355–368.

  13. Fukunaga K, Hummels D M. Bias of nearest neighbor error estimates. IEEE Trans. Pattern Analysis and Machine Intelligence, Jan. 1987, 9(1): 103–112.

  14. Fukunaga K, Hummels D M. Bayes error estimation using Parzen and k-NN procedures. IEEE Trans. Pattern Analysis and Machine Intelligence, 1987, 9(5): 634–643.


Author information

Corresponding author

Correspondence to Zhang Hongbin.

Additional information

Supported by the National Natural Science Foundation of China (No.69675007) and Beijing Municipal Natural Science Foundation (No.4972008).

ZHANG Hongbin received the B.S. degree in automation in 1968, and the M.S. degree in pattern recognition and intelligent systems in 1981, both from Tsinghua University, China. From 1986 to 1989 he was an invited researcher in the Department of Information Science at Kyoto University, Japan. From 1993 to 1994 he was a visiting scholar at Rensselaer Polytechnic Institute, USA. Since 1993, he has been a professor at the Computer Institute, Beijing Polytechnic University, China. His current research interests include pattern recognition, computer vision, neural networks, and image processing.

SUN Guangyu received the B.S. degree in geology from Peking University in 1992 and the M.S. degree from the Computer Institute, Beijing Polytechnic University, in 1999. His current research interests include pattern recognition and computer vision.

Cite this article

Zhang, H., Sun, G. Optimal selection of reference set for the nearest neighbor classification by Tabu search. J. Comput. Sci. & Technol. 16, 126–136 (2001). https://doi.org/10.1007/BF02950417

