Abstract
In clustering, most feature selection approaches consider all features of the data together to identify a single common feature subset that contributes to discovering the interesting clusters. However, many datasets comprise multiple feature subsets, each of which corresponds to the meaningful clusters in a different way. In this paper, we attempt to reveal a feature partition consisting of multiple non-overlapping feature blocks, where each block fits a finite mixture model. To find the desired feature partition, we use a local search algorithm based on simulated annealing. During the search for the optimal feature partition, we reuse previous estimation results to reduce computational cost.
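The search described above can be illustrated with a minimal sketch. The sketch below is not the paper's algorithm: it scores each feature block with a single Gaussian per feature via AIC (a stand-in for the paper's per-block finite mixture model), proposes moves that shift one feature between blocks, accepts them under a simulated-annealing criterion, and caches block scores so estimates for unchanged blocks are reused. All names and parameters here are illustrative.

```python
import math
import random

def block_score(data, block):
    # Simplification: fit one Gaussian per feature in the block (the paper
    # uses a finite mixture model per block) and return the block's AIC.
    n = len(data)
    ll = 0.0
    for j in block:
        col = [row[j] for row in data]
        mu = sum(col) / n
        var = sum((x - mu) ** 2 for x in col) / n or 1e-9  # guard zero variance
        ll += sum(-0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
                  for x in col)
    k = 2 * len(block)            # parameters: mean + variance per feature
    return 2 * k - 2 * ll         # AIC; lower is better

def anneal_partition(data, n_features, steps=2000, t0=1.0, cooling=0.995, seed=0):
    rng = random.Random(seed)
    part = [set(range(n_features))]   # start with all features in one block
    cache = {}                        # reuse of previous estimation results

    def score(p):
        total = 0.0
        for b in p:
            key = frozenset(b)
            if key not in cache:      # only re-estimate blocks that changed
                cache[key] = block_score(data, b)
            total += cache[key]
        return total

    cur, t = score(part), t0
    for _ in range(steps):
        # Proposal: move one random feature to another block or a new block.
        new = [set(b) for b in part]
        j = rng.randrange(n_features)
        src = next(b for b in new if j in b)
        src.discard(j)
        new = [b for b in new if b]
        if not new or rng.random() < 0.3:
            new.append({j})           # open a new singleton block
        else:
            rng.choice(new).add(j)    # join an existing block
        cand = score(new)
        # Metropolis acceptance: always take improvements, sometimes worse moves.
        if cand < cur or rng.random() < math.exp((cur - cand) / t):
            part, cur = new, cand
        t *= cooling                  # geometric cooling schedule
    return part, cur
```

The cache keyed on frozen feature sets is the point of the reuse step: a proposal changes at most two blocks, so the scores of all other blocks come from earlier iterations for free.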
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Namkoong, Y., Joo, Y., Dankel, D.D. (2010). Feature Subset-Wise Mixture Model-Based Clustering via Local Search Algorithm. In: Farzindar, A., Kešelj, V. (eds) Advances in Artificial Intelligence. Canadian AI 2010. Lecture Notes in Computer Science, vol 6085. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13059-5_15
DOI: https://doi.org/10.1007/978-3-642-13059-5_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13058-8
Online ISBN: 978-3-642-13059-5
eBook Packages: Computer Science