Abstract
In this work we propose a new information-theoretic clustering algorithm that infers cluster memberships by directly optimizing a non-parametric estimate of the mutual information between the data distribution and the cluster assignment. Although this objective has a solid theoretical foundation, it is hard to optimize. We therefore propose an approximate formulation that leads to an efficient algorithm with low runtime complexity. The algorithm has a single free parameter: the number of clusters to find. We demonstrate superior performance on several synthetic and real datasets.
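The abstract does not spell out the estimator or the optimization procedure, so the following is only an illustrative sketch and not the authors' algorithm. It assumes, based on the paper's title, that cluster entropy is estimated from minimum spanning tree (MST) length, using the form d·log(L) − (d−1)·log(n) up to an additive constant. The clustering step shown is the classical baseline of cutting the longest MST edges, swapped in for the paper's approximate optimization. The names `mst_edges`, `cut_longest_edges`, and `mst_entropy_score` are hypothetical.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (weight, i, j) tuples, one per MST edge.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection cost to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def cut_longest_edges(points, k):
    """Baseline (not the paper's method): remove the k-1 longest MST edges
    and label the resulting connected components 0, 1, 2, ..."""
    n = len(points)
    edges = sorted(mst_edges(points))
    kept = edges[: max(len(edges) - (k - 1), 0)]
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path compression
            x = root[x]
        return x

    for _, i, j in kept:
        root[find(i)] = find(j)
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


def mst_entropy_score(points, labels):
    """Assumed surrogate for the conditional entropy H(X | Y), up to a constant:
    sum over clusters of p(c) * (d * log(L_c) - (d - 1) * log(n_c)).
    Lower is better. Clusters with zero MST length are skipped (a simplification)."""
    n, d = len(points), len(points[0])
    total = 0.0
    for c in set(labels):
        members = [p for p, l in zip(points, labels) if l == c]
        length = sum(w for w, _, _ in mst_edges(members))
        if length > 0:
            total += len(members) / n * (
                d * math.log(length) - (d - 1) * math.log(len(members))
            )
    return total


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = cut_longest_edges(points, 2)
```

On these two well-separated groups, the baseline recovers the grouping, and the assumed score is lower for that labeling than for one that mixes the groups. Maximizing mutual information between data and assignment corresponds to minimizing such a conditional entropy term, since the entropy of the data itself does not depend on the assignment.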
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Müller, A.C., Nowozin, S., Lampert, C.H. (2012). Information Theoretic Clustering Using Minimum Spanning Trees. In: Pinz, A., Pock, T., Bischof, H., Leberl, F. (eds) Pattern Recognition. DAGM/OAGM 2012. Lecture Notes in Computer Science, vol 7476. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32717-9_21
DOI: https://doi.org/10.1007/978-3-642-32717-9_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32716-2
Online ISBN: 978-3-642-32717-9
eBook Packages: Computer Science, Computer Science (R0)