
Dimensionality of Random Subspaces

Conference paper
Classification — the Ubiquitous Challenge

Abstract

A significant improvement in classification accuracy can be obtained by aggregating multiple models. Most methods proposed in this field are based on sampling cases from the training set or on reweighting cases. Classification error can also be reduced by randomly selecting variables for the training subsamples or directly for the component models. In this paper we propose a feature selection method for ensembles that significantly reduces the dimensionality of the random subspaces.
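
The abstract refers to building ensembles by randomly selecting variables for the component models, i.e. the random subspace idea. Below is a minimal sketch of that general technique, assuming Python with NumPy and scikit-learn; the function and parameter names are illustrative only, and this is not the specific feature selection method proposed in the paper. Each component tree is fitted on a randomly drawn feature subset, and the ensemble predicts by majority vote.

    # Minimal sketch of a random-subspace ensemble (general technique only,
    # not the paper's proposed feature selection method). Assumes NumPy and
    # scikit-learn are available; all names here are illustrative.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def fit_random_subspace_ensemble(X, y, n_models=25, subspace_dim=5, seed=0):
        """Fit one decision tree per randomly drawn feature subset."""
        rng = np.random.default_rng(seed)
        n_features = X.shape[1]
        models = []
        for _ in range(n_models):
            # draw a random subspace of the variables (without replacement)
            features = rng.choice(n_features,
                                  size=min(subspace_dim, n_features),
                                  replace=False)
            tree = DecisionTreeClassifier(random_state=0).fit(X[:, features], y)
            models.append((features, tree))
        return models

    def predict_majority_vote(models, X):
        """Combine the component trees by simple majority voting.

        Assumes non-negative integer class labels in y.
        """
        votes = np.array([tree.predict(X[:, features])
                          for features, tree in models])
        return np.apply_along_axis(
            lambda column: np.bincount(column.astype(int)).argmax(),
            axis=0, arr=votes)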

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gatnar, E. (2005). Dimensionality of Random Subspaces. In: Weihs, C., Gaul, W. (eds) Classification — the Ubiquitous Challenge. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-28084-7_12
