Fuzzy Cluster Validation Using the Partition Negentropy Criterion

  • Conference paper
Artificial Neural Networks – ICANN 2009 (ICANN 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5769)

Abstract

We introduce the Partition Negentropy Criterion (PNC), a cluster validity index that rewards the average normality of the clusters, measured by their negentropy, and penalizes their overlap, measured by the partition entropy. The PNC aims to find well-separated clusters whose shape is approximately Gaussian. We use the new index to validate fuzzy partitions in a set of synthetic clustering problems and compare the results with those obtained using the AIC, BIC and ICL criteria. The partitions are obtained by fitting a Gaussian Mixture Model to the data with the EM algorithm. We show that, when the real clusters are normally distributed, all the criteria correctly assess the number of components, with AIC and BIC tolerating a higher cluster overlap. However, when the real cluster distributions are not Gaussian, and so depart from the distribution assumed by the mixture model, the PNC outperforms the other indices: it correctly evaluates the number of clusters, while the other criteria (especially AIC and BIC) tend to overestimate it.
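The quantities named in the abstract can be illustrated with a minimal sketch. This is not the paper's PNC formula, which the abstract does not state; it only shows the textbook definitions of the ingredients (partition entropy of a fuzzy membership matrix, negentropy of a one-dimensional variable, and the AIC and BIC penalties for a full-covariance Gaussian mixture). All function names are hypothetical, and the "lower is better" sign convention for AIC and BIC is an assumption, since some authors use the opposite sign.

```python
import math


def partition_entropy(memberships):
    """Mean Shannon entropy (nats) of the rows of a fuzzy membership matrix.

    Each row holds one point's membership degrees, assumed to sum to 1.
    A crisp partition gives 0; heavier overlap gives larger values.
    """
    total = 0.0
    for row in memberships:
        for u in row:
            if u > 0.0:  # treat 0 * log(0) as 0
                total -= u * math.log(u)
    return total / len(memberships)


def negentropy_1d(variance, entropy):
    """Negentropy of a 1-D variable: entropy of a Gaussian with the same
    variance minus the variable's own differential entropy (both in nats).
    It is zero for a Gaussian and positive otherwise.
    """
    gaussian_entropy = 0.5 * math.log(2.0 * math.pi * math.e * variance)
    return gaussian_entropy - entropy


def gmm_free_parameters(n_components, n_dims):
    """Free parameters of a Gaussian mixture with full covariance matrices."""
    weights = n_components - 1
    means = n_components * n_dims
    covariances = n_components * n_dims * (n_dims + 1) // 2
    return weights + means + covariances


def aic(log_likelihood, n_params):
    """Akaike criterion, lower is better (assumed sign convention)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    """Bayesian (Schwarz) criterion, lower is better (assumed sign convention)."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


if __name__ == "__main__":
    crisp = [[1.0, 0.0], [0.0, 1.0]]
    fuzzy = [[0.5, 0.5], [0.5, 0.5]]
    print(partition_entropy(crisp), partition_entropy(fuzzy))
    # A uniform variable on [0, 1] has variance 1/12 and entropy 0.
    print(negentropy_1d(1.0 / 12.0, 0.0))
    k = gmm_free_parameters(3, 2)
    print(k, aic(-100.0, k), bic(-100.0, k, 100))
```

In this sketch the maximally overlapping two-cluster partition has entropy log 2, while the crisp one has entropy 0, which is the sense in which partition entropy measures overlap. The negentropy of the uniform variable is positive, reflecting its departure from a Gaussian shape.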

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lago-Fernández, L.F., Sánchez-Montañés, M., Corbacho, F. (2009). Fuzzy Cluster Validation Using the Partition Negentropy Criterion. In: Alippi, C., Polycarpou, M., Panayiotou, C., Ellinas, G. (eds) Artificial Neural Networks – ICANN 2009. ICANN 2009. Lecture Notes in Computer Science, vol 5769. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04277-5_24

  • DOI: https://doi.org/10.1007/978-3-642-04277-5_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04276-8

  • Online ISBN: 978-3-642-04277-5

  • eBook Packages: Computer Science, Computer Science (R0)
