Data Characterization for Effective Prototype Selection

Mollineda, Ramón A.; Sánchez, J. Salvador; Sotoca, José M.

doi:10.1007/11492542_4

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 3523)

Included in the following conference series:

Iberian Conference on Pattern Recognition and Image Analysis

Abstract

The Nearest Neighbor classifier is one of the most popular supervised classification methods. It is simple, intuitive, and accurate in a wide variety of real-world applications. Despite its simplicity and effectiveness, practical use of this rule has historically been limited by its high storage requirements and the computational cost involved, as well as by the presence of outliers. To overcome these drawbacks, a suitable prototype selection scheme can be employed to reduce storage and computing time; it usually also provides some increase in classification accuracy. Nevertheless, in some practical cases prototype selection may even degrade the effectiveness of the classifier. From an empirical point of view, it is still difficult to know a priori when this method will behave appropriately. The present paper tries to predict how appropriate a prototype selection algorithm will be when applied to a particular problem, by characterizing the data with a set of complexity measures.
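
As a rough illustration of the idea in the abstract, the following Python sketch pairs a 1-NN classifier with one prototype selection scheme and one data complexity measure. The abstract does not say which algorithms or measures the paper evaluates, so the choices here are assumptions made only for illustration: a condensing-style selection that keeps the points the current subset misclassifies, and a two-class Fisher discriminant ratio as the complexity measure. All function names, and the synthetic data helper make_blobs, are hypothetical.

```python
import math
import random
from statistics import fmean, pvariance


def nearest_label(x, prototypes):
    """1-NN rule: return the label of the prototype closest to x."""
    _, label = min(prototypes, key=lambda p: math.dist(x, p[0]))
    return label


def accuracy(prototypes, samples):
    """Fraction of samples whose 1-NN label matches their true label."""
    hits = sum(nearest_label(x, prototypes) == y for x, y in samples)
    return hits / len(samples)


def condense(train, seed=0):
    """Condensing-style selection (an assumed, illustrative scheme):
    keep only the points that the current subset misclassifies."""
    order = list(range(len(train)))
    random.Random(seed).shuffle(order)
    kept = [order[0]]
    kept_set = {order[0]}
    changed = True
    while changed:  # stop after a full pass that adds nothing
        changed = False
        for i in order:
            if i in kept_set:
                continue
            x, y = train[i]
            if nearest_label(x, [train[j] for j in kept]) != y:
                kept.append(i)
                kept_set.add(i)
                changed = True
    return [train[j] for j in kept]


def fisher_ratio(train):
    """Two-class Fisher discriminant ratio: the maximum over features of
    (mean_a - mean_b)^2 / (var_a + var_b). Larger means easier to separate."""
    labels = sorted({y for _, y in train})
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two classes")
    a = [x for x, y in train if y == labels[0]]
    b = [x for x, y in train if y == labels[1]]
    best = 0.0
    for f in range(len(a[0])):
        col_a = [x[f] for x in a]
        col_b = [x[f] for x in b]
        gap = (fmean(col_a) - fmean(col_b)) ** 2
        spread = pvariance(col_a) + pvariance(col_b)
        if spread > 0:
            best = max(best, gap / spread)
        elif gap > 0:
            return math.inf  # zero spread, distinct means: perfectly separable
    return best


def make_blobs(n_per_class, gap, seed=0):
    """Hypothetical synthetic data: two unit-variance 2-D Gaussian classes
    whose centres are `gap` apart along the first feature."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, 0.0), (1, gap)):
        for _ in range(n_per_class):
            point = (rng.gauss(centre, 1.0), rng.gauss(0.0, 1.0))
            data.append((point, label))
    return data


if __name__ == "__main__":
    for gap in (1.0, 4.0):
        train = make_blobs(100, gap, seed=1)
        test = make_blobs(100, gap, seed=2)
        subset = condense(train)
        print(f"gap={gap}: ratio={fisher_ratio(train):.2f}, "
              f"kept {len(subset)}/{len(train)}, "
              f"acc full={accuracy(train, test):.2f}, "
              f"acc subset={accuracy(subset, test):.2f}")
```

Running the script prints, for an overlapping and a well-separated problem, the complexity value next to the size and test accuracy of the selected subset. Comparing such pairs across many data sets is one way to look for the kind of relationship the abstract describes; a single measure on synthetic data is not evidence of it.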

Author information

Authors and Affiliations

Dept. Llenguatges i Sistemes Informàtics, Universitat Jaume I, Av. Sos Baynat s/n, E-12071, Castelló de la Plana, Spain
Ramón A. Mollineda, J. Salvador Sánchez & José M. Sotoca

Editor information

Editors and Affiliations

Instituto Superior Técnico & Instituto de Sistemas e Robótica, 1049-001, Lisboa, Portugal
Jorge S. Marques
ETSI Informática y de Telecomunicación, University of Granada, 18071, Granada, Spain
Nicolás Pérez de la Blanca
Instituto Superior Técnico, CERENA-Centro de Recursos Naturais e Ambiente, Av. Rovisco Pais, 1049-001, Lisboa, Portugal
Pedro Pina

About this paper

Cite this paper

Mollineda, R.A., Sánchez, J.S., Sotoca, J.M. (2005). Data Characterization for Effective Prototype Selection. In: Marques, J.S., Pérez de la Blanca, N., Pina, P. (eds) Pattern Recognition and Image Analysis. IbPRIA 2005. Lecture Notes in Computer Science, vol 3523. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11492542_4

DOI: https://doi.org/10.1007/11492542_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26154-4
Online ISBN: 978-3-540-32238-2
eBook Packages: Computer Science, Computer Science (R0)
