Random Projection Ensemble Classifiers

  • Conference paper

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 24)

Abstract

We introduce a novel ensemble model based on random projections. The contribution of random projections is twofold. First, their randomness provides the diversity that is required for the construction of an ensemble model. Second, random projections embed the original data in a lower-dimensional space while preserving the dataset's geometric structure up to a given distortion. This reduces the computational complexity of constructing the model as well as the complexity of classification. Furthermore, the dimensionality reduction removes noisy features from the data and represents the information inherent in the raw data with a small number of features. The noise removal increases the accuracy of the classifier.

The proposed scheme was tested using WEKA-based procedures applied to 16 benchmark datasets from the UCI repository.
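To make the scheme concrete, the following is a minimal sketch of a random-projection ensemble written with scikit-learn rather than the WEKA-based procedures used in the paper; the member count, projection dimension, and decision-tree base learner are illustrative assumptions, not the paper's exact configuration.

```python
# A minimal sketch of the random-projection ensemble idea, assuming
# scikit-learn in place of the paper's WEKA-based procedures.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.random_projection import GaussianRandomProjection
from sklearn.tree import DecisionTreeClassifier

def fit_rp_ensemble(X, y, n_members=25, n_components=10, seed=0):
    """Train one base classifier per random projection of the data."""
    rng = np.random.RandomState(seed)
    members = []
    for _ in range(n_members):
        # Each member gets its own random projection matrix; the
        # randomness supplies the diversity the ensemble needs, and
        # the projection reduces the dimension of the training data.
        rp = GaussianRandomProjection(n_components=n_components,
                                      random_state=rng.randint(2**31 - 1))
        clf = DecisionTreeClassifier(random_state=rng.randint(2**31 - 1))
        clf.fit(rp.fit_transform(X), y)
        members.append((rp, clf))
    return members

def predict_rp_ensemble(members, X):
    """Combine the members' predictions by majority vote."""
    votes = np.stack([clf.predict(rp.transform(X)) for rp, clf in members])
    # Per-sample majority: the most frequent label among the members.
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
ens = fit_rp_ensemble(X_tr, y_tr)
acc = (predict_rp_ensemble(ens, X_te) == y_te).mean()
print(f"ensemble accuracy: {acc:.3f}")
```

Majority voting is only one plausible aggregation rule here; averaging the members' class probabilities would serve equally well for this illustration.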





Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schclar, A., Rokach, L. (2009). Random Projection Ensemble Classifiers. In: Filipe, J., Cordeiro, J. (eds) Enterprise Information Systems. ICEIS 2009. Lecture Notes in Business Information Processing, vol 24. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01347-8_26

  • DOI: https://doi.org/10.1007/978-3-642-01347-8_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01346-1

  • Online ISBN: 978-3-642-01347-8

  • eBook Packages: Computer Science, Computer Science (R0)
