A Non-parametric Maximum Entropy Clustering

  • Hideitsu Hino
  • Noboru Murata
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8681)


Clustering is a fundamental tool for exploratory data analysis. Information theoretic clustering is based on the optimization of information theoretic quantities such as entropy and mutual information. Because these quantities can now be estimated in a non-parametric manner, non-parametric information theoretic clustering has recently gained much attention. By assuming that the dataset is sampled from a certain cluster and assigning different sampling weights to the data depending on the cluster, cluster-conditional information theoretic quantities can be estimated. In this paper, a simple clustering algorithm is proposed based on the principle of maximum entropy. The algorithm is shown experimentally to perform comparably to, or to outperform, conventional non-parametric clustering methods.
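To make "estimated in a non-parametric manner" concrete, the sketch below shows one well-known estimator of this kind, a k-nearest-neighbour differential entropy estimate in the style of Kozachenko and Leonenko. It is an illustration under stated assumptions, not the weighted estimator or the clustering algorithm proposed in the paper: the function names `digamma_int` and `knn_entropy`, the choice `k=3`, and the Gaussian test sample are all hypothetical, and the input is assumed to contain no duplicate points.

```python
import math
import random


def digamma_int(n):
    """Digamma function at a positive integer: psi(n) = -gamma + sum_{j<n} 1/j."""
    euler_gamma = 0.5772156649015329
    return -euler_gamma + sum(1.0 / j for j in range(1, n))


def knn_entropy(points, k=3):
    """k-NN differential entropy estimate in nats (illustrative sketch).

    points: list of equal-length tuples, assumed to contain no duplicates.
    """
    n, d = len(points), len(points[0])
    # Log volume of the d-dimensional Euclidean unit ball.
    log_unit_ball = (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0)
    log_dist_sum = 0.0
    for i, p in enumerate(points):
        # Distances from p to every other point, sorted ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        log_dist_sum += math.log(dists[k - 1])  # k-th nearest neighbour
    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * log_dist_sum / n


rng = random.Random(0)
sample = [(rng.gauss(0.0, 1.0),) for _ in range(400)]
estimate = knn_entropy(sample, k=3)
exact = 0.5 * math.log(2.0 * math.pi * math.e)  # entropy of N(0, 1)
print(f"estimate={estimate:.3f}  exact={exact:.3f}")
```

No density model is fitted at any point: the estimate depends on the data only through nearest-neighbour distances, which is what makes cluster-wise versions of such quantities usable as clustering objectives.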


Keywords: Information Theoretic Clustering; Non-parametric; Likelihood and Entropy Estimator





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Hideitsu Hino, University of Tsukuba, Tsukuba, Japan
  • Noboru Murata, Waseda University, Shinjuku-ku, Japan
