Summary
When a popular classifier falls short of perfect accuracy in a practical application, possible causes include deficiencies in the algorithm, intrinsic difficulties in the data, and a mismatch between method and problem. We propose to address this question by developing measures of the geometrical and topological characteristics of point sets in high-dimensional spaces. Such measures provide a basis for analyzing classifier behavior beyond estimates of error rates. We discuss several measures useful for this characterization and their utility in analyzing data sets with known or controlled complexity. Our observations confirm their effectiveness and suggest several future directions.
© 2006 Springer Verlag London Limited
About this chapter
Cite this chapter
Ho, T.K., Basu, M., Law, M.H.C. (2006). Measures of Geometrical Complexity in Classification Problems. In: Basu, M., Ho, T.K. (eds) Data Complexity in Pattern Recognition. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-84628-172-3_1
DOI: https://doi.org/10.1007/978-1-84628-172-3_1
Publisher Name: Springer, London
Print ISBN: 978-1-84628-171-6
Online ISBN: 978-1-84628-172-3