Abstract
Significant improvements in classification accuracy can be obtained by aggregating multiple models. Most methods proposed in this field are based on sampling cases from the training set or on reweighting cases. Classification error can also be reduced by randomly selecting variables for the training subsamples, or directly for the model. In this paper we propose a feature selection method for ensembles that significantly reduces the dimensionality of the subspaces.
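The random-subspace idea summarized above can be sketched in a few lines: each base model is trained on a randomly chosen subset of the input variables, and predictions are combined by majority vote. This is a minimal illustration only; the decision-stump base learner, the toy data, and all function names below are assumptions for the sketch, not the feature selection method the paper proposes.

```python
import random
from collections import Counter

def train_stump(X, y, features):
    """Fit a one-feature threshold rule, searching only the given feature subset."""
    best = None  # (accuracy, feature, threshold, flipped)
    for f in features:
        for t in sorted(set(row[f] for row in X)):
            preds = [1 if row[f] > t else 0 for row in X]
            acc = sum(p == yi for p, yi in zip(preds, y)) / len(y)
            # also consider the inverted rule
            for flipped, a in ((False, acc), (True, 1.0 - acc)):
                if best is None or a > best[0]:
                    best = (a, f, t, flipped)
    return best[1:]  # (feature, threshold, flipped)

def stump_predict(stump, row):
    f, t, flipped = stump
    p = 1 if row[f] > t else 0
    return 1 - p if flipped else p

def random_subspace_ensemble(X, y, n_models=51, subspace_size=2, seed=0):
    """Train each base model on a random subset of the variables."""
    rng = random.Random(seed)
    d = len(X[0])
    return [train_stump(X, y, rng.sample(range(d), subspace_size))
            for _ in range(n_models)]

def ensemble_predict(models, row):
    """Combine the base models by simple majority vote."""
    votes = Counter(stump_predict(m, row) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: feature 0 separates the classes; features 1, 2 are noise.
X = [[0, 5, 1], [1, 4, 0], [2, 6, 1], [8, 5, 0], [9, 4, 1], [10, 6, 0]]
y = [0, 0, 0, 1, 1, 1]
models = random_subspace_ensemble(X, y)
preds = [ensemble_predict(models, row) for row in X]
```

Because the informative variable appears in most random subsets, the vote is dominated by accurate stumps even though no single model sees all features, which is the intuition the subspace-dimensionality question builds on.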
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gatnar, E. (2005). Dimensionality of Random Subspaces. In: Weihs, C., Gaul, W. (eds) Classification — the Ubiquitous Challenge. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-28084-7_12
DOI: https://doi.org/10.1007/3-540-28084-7_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25677-9
Online ISBN: 978-3-540-28084-2
eBook Packages: Mathematics and Statistics