Abstract
To classify an unseen (query) vector q with the k-Nearest Neighbors (k-NN) method, one computes a similarity function between q and the training vectors stored in a database. In the basic variant of the k-NN algorithm, the predicted class of q is the majority class among q's k nearest neighbors. Different similarity functions may be applied, leading to different classification results. In this paper a heterogeneous similarity function is constructed from different single-component metrics by minimizing the number of classification errors the system makes on a training set. On five tested datasets, the HSFL-NN system introduced in this paper has given better results on unseen samples than the plain k-NN method with an optimally selected k parameter and the optimal homogeneous similarity function.
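The sketch below illustrates the idea described in the abstract, not the paper's actual HSFL-NN implementation: each feature is assigned its own single-component metric, the per-feature contributions are summed into a heterogeneous dissimilarity, and the assignment is chosen to minimize the number of k-NN classification errors on the training set. The candidate metrics, function names, and the exhaustive search are illustrative assumptions.

```python
import numpy as np
from itertools import product

# Illustrative single-component metrics (assumed, not taken from the paper).
def manhattan(a, b):           # |a - b|
    return np.abs(a - b)

def squared(a, b):             # (a - b)^2
    return (a - b) ** 2

CANDIDATE_METRICS = [manhattan, squared]

def knn_predict(X_train, y_train, q, k, metrics):
    # Heterogeneous dissimilarity: sum of per-feature single-component metrics.
    d = np.zeros(len(X_train))
    for j, metric in enumerate(metrics):
        d += metric(X_train[:, j], q[j])
    nearest = np.argsort(d)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]   # majority class of the k neighbors

def training_errors(X, y, k, metrics):
    # Leave-one-out error count on the training set.
    errors = 0
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        errors += knn_predict(X[mask], y[mask], X[i], k, metrics) != y[i]
    return errors

def learn_heterogeneous_metric(X, y, k=3):
    # Brute-force search over per-feature metric assignments, kept only for
    # illustration; a real system would use a numerical optimizer instead.
    best_combo, best_errors = None, np.inf
    for combo in product(CANDIDATE_METRICS, repeat=X.shape[1]):
        e = training_errors(X, y, k, combo)
        if e < best_errors:
            best_combo, best_errors = combo, e
    return best_combo
```

The exhaustive search grows exponentially with the number of features; a full implementation would replace it with a numerical minimization of the training-error count.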
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Grudziński, K. (2008). Towards Heterogeneous Similarity Function Learning for the k-Nearest Neighbors Classification. In: Rutkowski, L., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing – ICAISC 2008. Lecture Notes in Computer Science, vol. 5097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69731-2_56
DOI: https://doi.org/10.1007/978-3-540-69731-2_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69572-1
Online ISBN: 978-3-540-69731-2