A Cluster-Based Feature Selection Approach

Covões, Thiago F.; Hruschka, Eduardo R.; de Castro, Leandro N.; Santos, Átila M.

doi:10.1007/978-3-642-02319-4_20

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5572)

Abstract

This paper proposes a filter-based method for feature selection. The filter partitions the feature space into clusters of similar features. The number of clusters and, consequently, the cardinality of the subset of selected features is estimated automatically from the data. Empirical results illustrate the performance of the proposed algorithm, which in general achieves classification accuracy competitive with that of a state-of-the-art feature selection algorithm while requiring less computing time.
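The general idea of the abstract (group similar features, let the data decide how many groups there are, keep one feature per group) can be sketched as follows. This is only an illustrative sketch, not the authors' algorithm: the abstract does not state the similarity measure, the clustering procedure, or the criterion used to estimate the number of clusters. The sketch assumes 1 - |Pearson correlation| as the feature distance, a simple k-medoids loop as the clustering step, and the average silhouette width as the selection criterion. It also ignores class labels, which the actual filter may or may not use. All function names are hypothetical.

```python
import math

def abs_corr_distance(x, y):
    """Assumed feature distance: 1 - |Pearson r| (0 means fully redundant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # constant feature: treat as uncorrelated
        return 1.0
    return max(0.0, 1.0 - abs(sxy) / math.sqrt(sxx * syy))

def distance_matrix(columns):
    m = len(columns)
    return [[0.0 if i == j else abs_corr_distance(columns[i], columns[j])
             for j in range(m)] for i in range(m)]

def assign(dist, medoids):
    # Each feature goes to its nearest medoid; a medoid always labels itself.
    return [i if i in medoids else min(medoids, key=lambda c: dist[i][c])
            for i in range(len(dist))]

def k_medoids(dist, k, max_iter=100):
    m = len(dist)
    # Deterministic start: most central feature, then farthest-first picks.
    medoids = [min(range(m), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        rest = [i for i in range(m) if i not in medoids]
        medoids.append(max(rest, key=lambda i: min(dist[i][c] for c in medoids)))
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        updated = []
        for c in medoids:
            members = [i for i in range(m) if labels[i] == c]
            # New medoid: member with the smallest total distance to its cluster.
            updated.append(min(members, key=lambda i: sum(dist[i][j] for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(dist, medoids)

def silhouette(dist, labels):
    clusters = {}
    for i, c in enumerate(labels):
        clusters.setdefault(c, []).append(i)
    total = 0.0
    for i, c in enumerate(labels):
        own = clusters[c]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0
        a = sum(dist[i][j] for j in own) / (len(own) - 1)
        b = min(sum(dist[i][j] for j in other) / len(other)
                for key, other in clusters.items() if key != c)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)

def select_features(columns):
    """Return indices of one representative feature per cluster (needs >= 3 features)."""
    dist = distance_matrix(columns)
    best = None
    for k in range(2, len(columns)):  # try every k from 2 to m - 1
        medoids, labels = k_medoids(dist, k)
        score = silhouette(dist, labels)
        if best is None or score > best[0]:
            best = (score, sorted(medoids))
    return best[1]

# Toy data: features 0-2 are near-duplicates of a trend, features 3-4 of an oscillation.
columns = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [2, 1, 4, 3, 6, 5, 8, 7],
    [8, 7, 6, 5, 4, 3, 2, 1],
    [1, -1, 1, -1, 1, -1, 1, -1],
    [2, -2, 2.5, -2, 2, -2.5, 2, -2],
]
selected = select_features(columns)
print(selected)
```

On this toy data the sketch keeps two features, one from each group of mutually redundant columns. Using medoids as representatives is a design choice of the sketch: a medoid is an actual feature, so the selected subset needs no further transformation of the data.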

Author information

Authors and Affiliations

University of São Paulo (USP), Brazil
Thiago F. Covões & Eduardo R. Hruschka
Catholic University of Santos (UniSantos), Brazil
Leandro N. de Castro & Átila M. Santos

Editor information

Editors and Affiliations

Grupo de Investigación GICAP, Área de Lenguajes Higher Polytechnic School, Universidad de Burgos, Burgos, Spain
Emilio Corchado
Department of Computer Science, University of Vermont, 33 Colchester Avenue, Burlington, VT, USA
Xindong Wu
Computer and Information Science, Helsinki University of Technology, P.O. Box 5400, 02015 HUT, Finland
Erkki Oja
Grupo de Investigación GICAP, Área de Lenguajes y Sistemas Informáticos, Departamento de Ingeniería Civil, Escuela Politécnica Superior, Universidad de Burgos, Campus Vena, Francisco de Vitoria, 09006, Burgos, Spain
Álvaro Herrero & Bruno Baruque

About this paper

Cite this paper

Covões, T.F., Hruschka, E.R., de Castro, L.N., Santos, Á.M. (2009). A Cluster-Based Feature Selection Approach. In: Corchado, E., Wu, X., Oja, E., Herrero, Á., Baruque, B. (eds) Hybrid Artificial Intelligence Systems. HAIS 2009. Lecture Notes in Computer Science, vol 5572. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02319-4_20

DOI: https://doi.org/10.1007/978-3-642-02319-4_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02318-7
Online ISBN: 978-3-642-02319-4
eBook Packages: Computer Science, Computer Science (R0)
