Ultimate Order Statistics-Based Prototype Reduction Schemes

Thomas, Anu; Oommen, B. John

doi:10.1007/978-3-319-03680-9_42

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8272)

Included in the following conference series:

Australasian Joint Conference on Artificial Intelligence

Abstract

The objective of Prototype Reduction Schemes (PRSs) and Border Identification (BI) algorithms is to reduce the number of training vectors while attempting to guarantee that the classifier built on the reduced design set performs as well, or nearly as well, as the classifier built on the original design set. In this paper, we push the limits of the field of PRSs to see whether we can obtain a classification accuracy comparable to the optimal by condensing the information in the data set into a single training point. We demonstrate that such PRSs exist and are attainable, and show that their design and implementation work within the recently-introduced paradigm of Order Statistics (OS)-based classifiers. These classifiers, referred to as Classification by Moments of Order Statistics (CMOS), are essentially anti-Bayesian in their modus operandi. We demonstrate the power and potential of CMOS to yield single-element PRSs which are either “selective” or “creative”, resorting to a non-parametric paradigm in the former case and a parametric paradigm in the latter. We also report a single-feature single-element creative PRS. All of these solutions have been used to achieve classification for real-life data sets from the UCI Machine Learning Repository, where we have followed an approach that is similar to the Naïve-Bayes’ (NB) strategy, although it is essentially of an anti-Naïve-Bayes’ paradigm. The amazing facet of this approach is that the training set can be reduced to a single pattern from each of the classes which is, in turn, determined by the CMOS features. It is even more fascinating to see that the scheme can be rendered operational by using the information in a single feature of such a single data point. In each of these cases, the accuracy of the proposed PRS-based approach is very close to the optimal Bayes’ bound and is almost comparable to that of the SVM.
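To make the idea of a single prototype per class more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes two classes, treats each feature independently (in the spirit of the Naïve-Bayes-like strategy described above), and places each class's prototype at an empirical quantile shifted toward the other class instead of at the class mean. The quantile level `q = 2/3`, the majority vote across features, and all function names are illustrative assumptions; the abstract does not specify them.

```python
from statistics import mean


def empirical_quantile(values, q):
    """Return the observed value at (roughly) quantile q of the sample."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, max(0, round(q * (len(ordered) - 1))))
    return ordered[idx]


def fit_single_prototypes(class_a, class_b, q=2 / 3):
    """Build one prototype per class, feature by feature.

    Assumption (not taken from the paper): for each feature, the class
    with the smaller mean is represented by its q-quantile and the other
    class by its (1 - q)-quantile, so that both representative points
    lie toward the opposite class rather than at the class centres.
    """
    proto_a, proto_b = [], []
    for j in range(len(class_a[0])):
        col_a = [row[j] for row in class_a]
        col_b = [row[j] for row in class_b]
        if mean(col_a) <= mean(col_b):
            proto_a.append(empirical_quantile(col_a, q))
            proto_b.append(empirical_quantile(col_b, 1 - q))
        else:
            proto_a.append(empirical_quantile(col_a, 1 - q))
            proto_b.append(empirical_quantile(col_b, q))
    return proto_a, proto_b


def classify(x, proto_a, proto_b):
    """Each feature votes for the class whose prototype value is nearer.

    A strict majority of votes is needed for "A"; ties go to "B".
    """
    votes_a = sum(
        abs(xj - pa) < abs(xj - pb)
        for xj, pa, pb in zip(x, proto_a, proto_b)
    )
    return "A" if 2 * votes_a > len(x) else "B"
```

After fitting, the whole training set is replaced by the two returned points, and `classify` compares a test vector against those two points only. Restricting the vote to one chosen feature would give a single-feature variant of the same sketch.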

Author information

Authors and Affiliations

School of Computer Science, Carleton University, Ottawa, Canada, K1S 5B6
Anu Thomas & B. John Oommen

Editor information

Editors and Affiliations

Department of Information Science, University of Otago, 9054, Dunedin, New Zealand
Stephen Cranefield
Macquarie University, 2109, Sydney, NSW, Australia
Abhaya Nayak

About this paper

Cite this paper

Thomas, A., Oommen, B.J. (2013). Ultimate Order Statistics-Based Prototype Reduction Schemes. In: Cranefield, S., Nayak, A. (eds) AI 2013: Advances in Artificial Intelligence. AI 2013. Lecture Notes in Computer Science, vol 8272. Springer, Cham. https://doi.org/10.1007/978-3-319-03680-9_42

DOI: https://doi.org/10.1007/978-3-319-03680-9_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03679-3
Online ISBN: 978-3-319-03680-9
eBook Packages: Computer Science, Computer Science (R0)
