Rotation Forest on Microarray Domain: PCA versus ICA

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6097)

Abstract

Rotation Forest (RF) is an ensemble method that has proven effective on microarray data set classification problems. RF works by generating sparse rotation matrices of the input space, which yields base classifiers that are both accurate and diverse. In its original formulation, the elemental rotations were obtained by Principal Component Analysis (PCA). However, for microarray data sets, Independent Component Analysis (ICA) may be a better option. This paper reports an experimental study on ten microarray data sets. The study confirms that, except when only a small number of attributes is used, Rotation Forest outperforms Bagging and Boosting on this domain. However, RF with ICA does not generally improve on RF with PCA.
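
To make the idea of a sparse rotation matrix concrete, the following is a minimal sketch, not the authors' implementation. It assumes the features are split at random into disjoint subsets, PCA is run on a random sample of the instances for each subset, and the resulting axes are placed in an otherwise zero matrix. The function names (`pca_axes`, `sparse_rotation_matrix`), the number of subsets, and the 75% sample size are illustrative assumptions, and refinements such as class subsampling are left out.

```python
import numpy as np


def pca_axes(block):
    """Return the principal axes (as columns) of a samples-by-features block."""
    centered = block - block.mean(axis=0)
    # Rows of vt are orthonormal principal directions; full_matrices=True
    # keeps a complete basis even with fewer samples than features.
    _, _, vt = np.linalg.svd(centered, full_matrices=True)
    return vt.T


def sparse_rotation_matrix(X, n_subsets, rng, sample_fraction=0.75):
    """Build one sparse rotation matrix from PCA on disjoint feature subsets.

    n_subsets and sample_fraction are illustrative parameters (assumptions).
    """
    n_samples, n_features = X.shape
    # Randomly partition the feature indices into disjoint subsets.
    subsets = np.array_split(rng.permutation(n_features), n_subsets)
    rotation = np.zeros((n_features, n_features))
    n_draw = max(2, int(sample_fraction * n_samples))
    for subset in subsets:
        # A fresh random sample per subset gives each ensemble member
        # a different rotation.
        rows = rng.choice(n_samples, size=n_draw, replace=True)
        axes = pca_axes(X[np.ix_(rows, subset)])
        # Fill only this subset's block; entries linking different
        # subsets stay zero, which is what makes the matrix sparse.
        rotation[np.ix_(subset, subset)] = axes
    return rotation


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 6))        # toy data: 20 samples, 6 features
    R = sparse_rotation_matrix(X, n_subsets=3, rng=rng)
    X_rotated = X @ R                   # input for one base classifier
    print(X_rotated.shape)
```

In an ensemble, each base classifier would be trained on `X @ R` with its own independently drawn `R`. An ICA variant could, under the same assumptions, replace `pca_axes` with a function that returns an ICA unmixing matrix for the block; such a matrix is generally not orthogonal.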

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Alonso-González, C.J., Moro-Sancho, Q.I., Ramos-Muñoz, I., Simón-Hurtado, M.A. (2010). Rotation Forest on Microarray Domain: PCA versus ICA. In: García-Pedrajas, N., Herrera, F., Fyfe, C., Benítez, J.M., Ali, M. (eds) Trends in Applied Intelligent Systems. IEA/AIE 2010. Lecture Notes in Computer Science, vol 6097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13025-0_11

  • DOI: https://doi.org/10.1007/978-3-642-13025-0_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13024-3

  • Online ISBN: 978-3-642-13025-0

  • eBook Packages: Computer Science, Computer Science (R0)
