A Cluster-Based Feature Selection Approach

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5572)

Abstract

This paper proposes a filter-based method for feature selection. The filter partitions the feature space into clusters of similar features. The number of clusters and, consequently, the cardinality of the selected feature subset are estimated automatically from the data. Empirical results show that the proposed algorithm generally achieves classification accuracy competitive with a state-of-the-art feature selection algorithm, while requiring less computing time.
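The abstract does not state which similarity measure, clustering algorithm, or criterion for choosing the number of clusters the paper uses. The following is therefore only a minimal sketch of the general idea, under assumptions of our own: features are compared by 1 minus the absolute Pearson correlation, clustered with a simple k-medoids procedure, the number of clusters is chosen by the mean silhouette width, and the medoid of each cluster is kept as the selected feature. All function names are hypothetical.

```python
import numpy as np

def feature_dissimilarity(X):
    """Assumed measure: 1 - |Pearson correlation| between columns of X.
    Requires that no feature is constant (correlation would be undefined)."""
    D = 1.0 - np.abs(np.corrcoef(X, rowvar=False))
    D = np.clip(D, 0.0, None)
    np.fill_diagonal(D, 0.0)
    return D

def farthest_first(D, k):
    """Deterministic initialisation: start from the most central feature,
    then repeatedly add the feature farthest from its nearest medoid."""
    medoids = [int(np.argmin(D.sum(axis=1)))]
    while len(medoids) < k:
        nearest = D[:, medoids].min(axis=1)
        medoids.append(int(np.argmax(nearest)))
    return np.array(medoids)

def k_medoids(D, k, n_iter=100):
    """Alternate between assigning features to the nearest medoid and
    re-centring each medoid on the member with the smallest total distance."""
    medoids = farthest_first(D, k)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                within = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids, np.argmin(D[:, medoids], axis=1)

def mean_silhouette(D, labels):
    """Mean silhouette width computed from a dissimilarity matrix."""
    clusters = np.unique(labels)
    if clusters.size < 2:
        return 0.0
    scores = np.zeros(D.shape[0])
    for i in range(D.shape[0]):
        same = labels == labels[i]
        same[i] = False
        if not same.any():
            continue  # singleton cluster: silhouette 0 by convention
        a = D[i, same].mean()
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return float(scores.mean())

def select_features(X, k_max=None):
    """Return the indices of one representative feature per cluster.
    Assumes X has at least three features (columns)."""
    D = feature_dissimilarity(X)
    k_max = k_max or D.shape[0] - 1
    best_score, best_medoids = -np.inf, None
    for k in range(2, k_max + 1):
        medoids, labels = k_medoids(D, k)
        score = mean_silhouette(D, labels)
        if score > best_score:
            best_score, best_medoids = score, medoids
    return np.sort(best_medoids)

if __name__ == "__main__":
    # Synthetic data: three latent signals, each observed as three noisy copies.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(300, 3))
    X = np.repeat(latent, 3, axis=1) + 0.1 * rng.normal(size=(300, 9))
    print(select_features(X))
```

Because the sketch never looks at class labels, it behaves as a filter: the subset is chosen before any classifier is trained, and its size is simply the number of clusters that scores best.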

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Covões, T.F., Hruschka, E.R., de Castro, L.N., Santos, Á.M. (2009). A Cluster-Based Feature Selection Approach. In: Corchado, E., Wu, X., Oja, E., Herrero, Á., Baruque, B. (eds) Hybrid Artificial Intelligence Systems. HAIS 2009. Lecture Notes in Computer Science, vol 5572. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02319-4_20

  • DOI: https://doi.org/10.1007/978-3-642-02319-4_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02318-7

  • Online ISBN: 978-3-642-02319-4

  • eBook Packages: Computer Science (R0)
