Building an efficient nearest neighbor classifier requires meeting three objectives: achieving a high accuracy rate, minimizing the set of prototypes so that the classifier remains tractable even with large databases, and reducing the set of features used to describe the prototypes, since irrelevant or redundant features are likely to contribute more noise than useful information. These objectives are not independent. This chapter investigates a method based on a genetic algorithm hybridized with a local optimization procedure. Concepts are introduced to promote both diversity and elitism in the genetic population. The prototype selection removes noisy and superfluous prototypes and retains, among the others, only the most critical ones. Moreover, the better the selection, the faster the algorithm. The interest of the method is demonstrated on synthetic and real chemometric data involving a large number of features, and its performance is compared with that of well-known algorithms.

Key words: Feature selection, Genetic algorithm, Hybrid algorithm, Classification, k nearest neighbors
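The general idea described in the abstract — evolving, with a genetic algorithm, a binary mask that simultaneously selects prototypes and features for a nearest neighbor classifier — can be sketched as follows. This is a minimal illustrative sketch, not the chapter's actual algorithm: the fitness function, the size penalty `alpha`, and the simple elitist survival scheme are assumptions chosen for brevity, and the chapter's diversity-preservation mechanisms and local optimization step are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def knn_accuracy(X, y, proto_mask, feat_mask):
    """1-NN accuracy over all samples, classifying each one with its
    nearest selected prototype (excluding itself), using only the
    selected features."""
    if proto_mask.sum() == 0 or feat_mask.sum() == 0:
        return 0.0
    P = np.flatnonzero(proto_mask)
    Xf = X[:, feat_mask.astype(bool)]
    d = ((Xf[:, None, :] - Xf[None, P, :]) ** 2).sum(-1)  # squared distances
    for j, p in enumerate(P):           # a prototype is not its own neighbor
        d[p, j] = np.inf
    pred = y[P[np.argmin(d, axis=1)]]
    return float((pred == y).mean())

def fitness(ind, X, y, alpha=0.05):
    """Reward accuracy, lightly penalize large prototype/feature subsets
    (alpha is an illustrative trade-off parameter)."""
    n_p = X.shape[0]
    proto_mask, feat_mask = ind[:n_p], ind[n_p:]
    size = proto_mask.mean() + feat_mask.mean()
    return knn_accuracy(X, y, proto_mask, feat_mask) - alpha * size

def evolve(X, y, pop_size=20, gens=30, p_mut=0.05):
    """Evolve a binary mask of length n_prototypes + n_features."""
    n = X.shape[0] + X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n))
    for _ in range(gens):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        elite = pop[np.argsort(scores)[::-1][: pop_size // 2]]  # elitism
        children = elite.copy()
        for c in children:              # uniform crossover with a random mate
            mate = elite[rng.integers(len(elite))]
            cross = rng.random(n) < 0.5
            c[cross] = mate[cross]
        mut = rng.random(children.shape) < p_mut                # bit-flip mutation
        children[mut] ^= 1
        pop = np.vstack([elite, children])
    scores = np.array([fitness(ind, X, y) for ind in pop])
    return pop[int(np.argmax(scores))]
```

On well-separated data the evolved mask typically keeps only a few prototypes and features while preserving near-perfect 1-NN accuracy, which is exactly the trade-off the chapter's objectives describe.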
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
Cite this chapter
Ros, F., Guillaume, S. (2007). An Efficient Nearest Neighbor Classifier. In: Abraham, A., Grosan, C., Ishibuchi, H. (eds) Hybrid Evolutionary Algorithms. Studies in Computational Intelligence, vol 75. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73297-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73296-9
Online ISBN: 978-3-540-73297-6
eBook Packages: Engineering