Measures of Geometrical Complexity in Classification Problems

Ho, Tin Kam; Basu, Mitra; Law, Martin Hiu Chung

doi:10.1007/978-1-84628-172-3_1

Part of the book series: Advanced Information and Knowledge Processing (AI&KP)

Summary

When popular classifiers fall short of perfect accuracy in a practical application, the possible causes include deficiencies in the algorithms, intrinsic difficulties in the data, and a mismatch between methods and problems. We propose to address this mystery by developing measures of the geometrical and topological characteristics of point sets in high-dimensional spaces. Such measures provide a basis for analyzing classifier behavior beyond estimates of error rates. We discuss several measures useful for this characterization and their utility in analyzing data sets with known or controlled complexity. Our observations confirm their effectiveness and suggest several future directions.

Author information

Authors and Affiliations

Mathematical & Algorithmic Sciences Research Center, Bell Laboratories, Lucent Technologies, Murray Hill, NJ, 07974-0636, USA
Tin Kam Ho
National Science Foundation, 4201 Wilson Blvd., Arlington, VA, 22230, USA
Mitra Basu
Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
Martin Hiu Chung Law

Editor information

Editors and Affiliations

Electrical Engineering Department, City College, City University of New York, USA
Mitra Basu PhD
Bell Laboratories, Lucent Technologies, New Jersey, USA
Tin Kam Ho BBA, MS, PhD

About this chapter

Cite this chapter

Ho, T.K., Basu, M., Law, M.H.C. (2006). Measures of Geometrical Complexity in Classification Problems. In: Basu, M., Ho, T.K. (eds) Data Complexity in Pattern Recognition. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-84628-172-3_1

DOI: https://doi.org/10.1007/978-1-84628-172-3_1
Publisher Name: Springer, London
Print ISBN: 978-1-84628-171-6
Online ISBN: 978-1-84628-172-3
eBook Packages: Computer Science, Computer Science (R0)
