A Feature Selection Wrapper for Mixtures

Figueiredo, Mário A. T.; Jain, Anil K.; Law, Martin H.

doi:10.1007/978-3-540-44871-6_27

Conference paper
First Online: 01 January 2003

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2652)

Abstract

We propose a feature selection approach for clustering which extends Koller and Sahami’s mutual-information-based criterion to the unsupervised case. This is achieved with the help of a mixture-based model and the corresponding expectation-maximization algorithm. The result is a backward search scheme, able to sort the features by order of relevance. Finally, an MDL criterion is used to prune the sorted list of features, yielding a feature selection criterion. The proposed approach can be classified as a wrapper, since it wraps the mixture estimation algorithm in an outer layer that performs feature selection. Preliminary experimental results show that the proposed method has promising performance.
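As a rough illustration of the backward ranking idea, the following Python sketch orders features by how much the component posteriors of a mixture change when each feature is dropped. It is not the authors' procedure: it assumes a diagonal-covariance Gaussian mixture whose parameters are set by hand rather than estimated by EM, it uses the average Kullback-Leibler divergence between posteriors as a simplified stand-in for the Koller and Sahami criterion, and it omits the MDL pruning step. All names (`posterior`, `rank_features`, the toy data) are hypothetical.

```python
import math
from itertools import product


def log_gauss(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def posterior(x, weights, means, variances, features):
    """Posterior p(k | x) of a diagonal Gaussian mixture using only `features`."""
    logs = [
        math.log(w) + sum(log_gauss(x[j], mu[j], var[j]) for j in features)
        for w, mu, var in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def kl(p, q):
    """Kullback-Leibler divergence between two discrete distributions."""
    return sum(pi * math.log(pi / max(qi, 1e-300)) for pi, qi in zip(p, q) if pi > 0)


def rank_features(data, weights, means, variances):
    """Backward search: repeatedly drop the feature whose removal changes
    the component posteriors the least. Returns features, most relevant first."""
    remaining = list(range(len(data[0])))
    removed = []
    while len(remaining) > 1:
        full = [posterior(x, weights, means, variances, remaining) for x in data]

        def loss(j):
            reduced = [f for f in remaining if f != j]
            return sum(
                kl(p, posterior(x, weights, means, variances, reduced))
                for p, x in zip(full, data)
            ) / len(data)

        least_relevant = min(remaining, key=loss)
        remaining.remove(least_relevant)
        removed.append(least_relevant)
    return remaining + removed[::-1]


# Hand-set two-component mixture (an assumption, not fitted by EM):
# feature 0 is pure noise, feature 1 separates the components strongly,
# feature 2 separates them weakly.
weights = [0.5, 0.5]
means = [[0.0, -2.0, -0.5], [0.0, 2.0, 0.5]]
variances = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]

# Deterministic toy data: a small grid of offsets around each component mean.
offsets = (-1.0, 0.0, 1.0)
data = [
    [m + d for m, d in zip(mu, delta)]
    for mu in means
    for delta in product(offsets, repeat=3)
]

ranking = rank_features(data, weights, means, variances)
print(ranking)  # most relevant first: [1, 2, 0]
```

In a wrapper of the kind the abstract describes, the mixture would be estimated from the data by EM, and a separate criterion such as MDL would decide where to cut the sorted list; the sketch fixes the parameters only to keep the ranking step visible.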

Work partially supported by the Foundation for Science and Technology (Portugal), grant POSI/33143/SRI/2000, and the Office of Naval Research (USA), grant 00014-01-1-0266.

References

Koller, D., Sahami, M.: Toward optimal feature selection. In: Proceedings of the International Conference on Machine Learning, Bari, Italy, pp. 284–292 (1996)
Kohavi, R., John, G.: Wrappers for feature subset selection. Artificial Intelligence 97(1-2), 273–324 (1997)
Jain, A.K., Zongker, D.: Feature selection: Evaluation, application, and small sample performance. IEEE Transactions on Pattern Analysis and Machine Intelligence 19(2), 153–158 (1997)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning. Springer, New York (2001)
Jain, A.K., Dubes, R.: Algorithms for Clustering Data. Prentice Hall, Englewood Cliffs (1988)
Fraley, C., Raftery, A.: Model based clustering, discriminant analysis, and density estimation. Journal of the American Statist. Assoc. 97, 611–631 (2002)
McLachlan, G., Basford, K.: Mixture Models: Inference and Application to Clustering. Marcel Dekker, New York (1988)
McLachlan, G., Peel, D.: Finite Mixture Models. John Wiley & Sons, New York (2000)
Figueiredo, M., Jain, A.K.: Unsupervised learning of finite mixture models. IEEE Trans. on Pattern Analysis and Machine Intelligence 24, 381–396 (2002)
Dy, J., Brodley, C.: Feature subset selection and order identification for unsupervised learning. In: Proc. 17th International Conf. on Machine Learning, pp. 247–254. Morgan Kaufmann, San Francisco (2000)
Vaithyanathan, S., Dom, B.: Generalized model selection for unsupervised learning in high dimensions. In: Solla, S., Leen, T., Müller, K.R. (eds.) Advances in Neural Information Processing Systems 12. MIT Press, Cambridge (2000)
Dash, M., Liu, H.: Feature selection for clustering. In: Proc. of Pacific-Asia Conference on Knowledge Discovery and Data Mining (2000)
Kim, Y., Street, W., Menczer, F.: Feature Selection in Unsupervised Learning via Evolutionary Search. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2000)
Devaney, M., Ram, A.: Efficient feature selection in conceptual clustering. In: International Conference on Machine Learning, pp. 92–97 (1997)
Dempster, A., Laird, N., Rubin, D.: Maximum likelihood estimation from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B 39, 1–38 (1977)
McLachlan, G., Krishnan, T.: The EM Algorithm and Extensions. John Wiley & Sons, New York (1997)
Cover, T., Thomas, J.: Elements of Information Theory. John Wiley & Sons, New York (1991)
Trunk, G.: A problem of dimensionality: A simple example. IEEE Trans. on Pattern Analysis and Machine Intelligence 1(3), 306–307 (1979)
Rissanen, J.: Stochastic Complexity in Statistical Inquiry. World Scientific, Singapore (1989)
Author information

Authors and Affiliations

Instituto de Telecomunicações, and Departamento de Engenharia Electrotécnica e de Computadores, Instituto Superior Técnico, 1049-001, Lisboa, Portugal
Mário A. T. Figueiredo
Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
Anil K. Jain & Martin H. Law

Editor information

Editors and Affiliations

Unitat de Gràfics i Visió per Ordinador Departament de Ciències Matemàtiques i Informàtica, Universitat de les Illes Balears Edifici Anselm Turmeda, Ctra. de Valldemossa km 7,5, 07122, Palma de Mallorca, Spain
Francisco José Perales
FEUP - Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, 4200-465, Porto, Portugal
Aurélio J. C. Campilho
Departamento de Ciencias de la Computación e I.A., Universidad de Granada, E.T.S. Ing. Informática, 18071, Granada, Spain
Nicolás Pérez de la Blanca
Dept. System Engineering and Automation, Universitat Politècnica de Catalunya (UPC) Barcelona, Spain
Alberto Sanfeliu

About this paper

Cite this paper

Figueiredo, M.A.T., Jain, A.K., Law, M.H. (2003). A Feature Selection Wrapper for Mixtures. In: Perales, F.J., Campilho, A.J.C., de la Blanca, N.P., Sanfeliu, A. (eds) Pattern Recognition and Image Analysis. IbPRIA 2003. Lecture Notes in Computer Science, vol 2652. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-44871-6_27

DOI: https://doi.org/10.1007/978-3-540-44871-6_27
Published: 18 September 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40217-6
Online ISBN: 978-3-540-44871-6
eBook Packages: Springer Book Archive