Summary
The size of the training set is important in characterizing data complexity. If a standard Fisher linear discriminant function or a Euclidean distance classifier is used to classify two multivariate Gaussian populations sharing a common covariance matrix, several measures of data complexity play an important role. The types of potential classification rules cannot be ignored when determining the data complexity. The three factors (sample size, data complexity, and classifier complexity) are mutually dependent. In situations where many classifiers are potentially useful, exact characterization of the data complexity requires a greater number of characteristics.
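The setting named in the summary can be illustrated with a small simulation. The following is a minimal sketch, not the chapter's own experiment: it draws two Gaussian classes with a shared covariance matrix, trains a Euclidean distance classifier and a Fisher linear discriminant on training sets of different sizes, and estimates the error on a large independent test set. The dimensionality, the class means, the covariance construction, and all function names (`train_edc`, `train_fisher`, `error_rate`, `demo`) are assumptions chosen only for illustration.

```python
import numpy as np


def train_edc(X1, X2):
    """Euclidean distance classifier: uses only the two sample means."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    w = m1 - m2
    b = -0.5 * w @ (m1 + m2)
    return w, b


def train_fisher(X1, X2):
    """Fisher linear discriminant: sample means plus pooled sample covariance."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    n1, n2 = len(X1), len(X2)
    # Pooled covariance estimate; needs n1 + n2 - 2 >= dim to be invertible.
    S = ((X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)) / (n1 + n2 - 2)
    w = np.linalg.solve(S, m1 - m2)
    b = -0.5 * w @ (m1 + m2)
    return w, b


def error_rate(w, b, X1, X2):
    """Fraction of misclassified vectors; class 1 is assigned when w.x + b > 0."""
    wrong = np.sum(X1 @ w + b <= 0) + np.sum(X2 @ w + b > 0)
    return wrong / (len(X1) + len(X2))


def demo(n_train, dim=10, n_test=5000, seed=0):
    """Train both classifiers on n_train vectors per class; return test errors."""
    rng = np.random.default_rng(seed)
    # Assumed data model: a random common covariance and a fixed mean shift.
    A = rng.normal(size=(dim, dim))
    cov = A @ A.T / dim + np.eye(dim)
    mu1, mu2 = np.zeros(dim), np.full(dim, 0.6)

    def draw(mu, n):
        return rng.multivariate_normal(mu, cov, size=n)

    train1, train2 = draw(mu1, n_train), draw(mu2, n_train)
    test1, test2 = draw(mu1, n_test), draw(mu2, n_test)

    err_edc = error_rate(*train_edc(train1, train2), test1, test2)
    err_fisher = error_rate(*train_fisher(train1, train2), test1, test2)
    return err_edc, err_fisher


if __name__ == "__main__":
    for n in (10, 50, 500):
        e_edc, e_fisher = demo(n)
        print(f"n per class = {n:4d}  EDC error = {e_edc:.3f}  Fisher error = {e_fisher:.3f}")
```

The Fisher rule has to estimate a full covariance matrix while the Euclidean distance classifier estimates only the means, so comparing their test errors across the three training-set sizes gives a concrete view of how sample size and classifier complexity interact on the same data.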
Copyright information
© 2006 Springer Verlag London Limited
About this chapter
Cite this chapter
Raudys, Š. (2006). Measures of Data and Classifier Complexity and the Training Sample Size. In: Basu, M., Ho, T.K. (eds) Data Complexity in Pattern Recognition. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-84628-172-3_3
DOI: https://doi.org/10.1007/978-1-84628-172-3_3
Publisher Name: Springer, London
Print ISBN: 978-1-84628-171-6
Online ISBN: 978-1-84628-172-3
eBook Packages: Computer Science, Computer Science (R0)