Fast-Ensembles of Minimum Redundancy Feature Selection

  • Chapter
Ensembles in Machine Learning Applications

Part of the book series: Studies in Computational Intelligence (SCI, volume 373)

Abstract

Finding relevant subspaces in very high-dimensional data is a challenging task, and not only for microarray data. Feature selection is meant to enhance classification performance, but it must also be stable, i.e., the set of selected features should not change when different subsets of a population are used. Ensemble methods have succeeded in increasing both stability and classification accuracy. However, their runtime prevents them from scaling up to real-world applications. We propose two methods which enhance correlation-based feature selection such that the stability of feature selection comes with little or even no extra runtime. We show the efficiency of the algorithms analytically and empirically on a wide range of datasets.
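
The stability requirement stated in the abstract can be made concrete with a small sketch. The code below is not the chapter's algorithm: it assumes a plain top-k ranking by absolute Pearson correlation with the label as a stand-in selector, and it assumes the mean pairwise Jaccard index as the stability measure. The function names (`top_k_by_correlation`, `jaccard`, `selection_stability`) and the parameters `n_subsets` and `frac` are hypothetical, chosen only for illustration.

```python
import itertools
import numpy as np

def top_k_by_correlation(X, y, k):
    """Return the indices of the k features with the largest |Pearson r| to y.

    Illustrative stand-in selector (an assumption, not the chapter's method).
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf  # constant columns get correlation 0
    r = (Xc * yc[:, None]).sum(axis=0) / denom
    return set(np.argsort(-np.abs(r))[:k].tolist())

def jaccard(a, b):
    """Jaccard index of two non-empty feature sets."""
    return len(a & b) / len(a | b)

def selection_stability(X, y, k, n_subsets=10, frac=0.8, seed=0):
    """Mean pairwise Jaccard index of the feature sets selected on
    random subsamples (drawn without replacement) of the examples."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    m = int(frac * n)
    selected = []
    for _ in range(n_subsets):
        idx = rng.choice(n, size=m, replace=False)
        selected.append(top_k_by_correlation(X[idx], y[idx], k))
    pairs = itertools.combinations(selected, 2)
    return float(np.mean([jaccard(a, b) for a, b in pairs]))
```

A value of 1.0 means every subsample produced the same feature set; values near 0 mean the selected sets barely overlap. Repeating the selection on each subsample is also what makes a naive ensemble expensive, which is the runtime cost the abstract refers to.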

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Schowe, B., Morik, K. (2011). Fast-Ensembles of Minimum Redundancy Feature Selection. In: Okun, O., Valentini, G., Re, M. (eds) Ensembles in Machine Learning Applications. Studies in Computational Intelligence, vol 373. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22910-7_5

  • DOI: https://doi.org/10.1007/978-3-642-22910-7_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22909-1

  • Online ISBN: 978-3-642-22910-7

  • eBook Packages: Engineering, Engineering (R0)
