Unsupervised Non-hierarchical Entropy-based Clustering

  • M. Jardino
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


We present an unsupervised non-hierarchical clustering method that partitions unlabelled objects into K non-overlapping clusters. The interest of the method rests on the convexity of the entropy-based clustering criterion, which is demonstrated here. This criterion makes it possible to reach an optimal partition independently of the initial conditions, using a step-by-step iterative Monte-Carlo process. Several data sets illustrate the main properties of this clustering.
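The abstract does not spell out the criterion or the update rule, so the following is only a minimal sketch of how an entropy-driven, step-by-step Monte-Carlo partitioning could look. It assumes each object is a vector of non-negative counts and uses the conditional entropy of the features given the cluster as a stand-in criterion. The function names `partition_entropy` and `monte_carlo_cluster`, the parameters `n_steps` and `seed`, and the "accept only if the criterion decreases" rule are all illustrative assumptions, not the chapter's actual method.

```python
import math
import random


def partition_entropy(counts, labels, k):
    """Conditional entropy H(feature | cluster), in nats, of a hard partition.

    Assumption: this is a stand-in criterion; the chapter's exact
    entropy-based criterion is not given in the abstract.
    """
    n_features = len(counts[0])
    # Sum the count vectors of all objects assigned to each cluster.
    cluster_counts = [[0.0] * n_features for _ in range(k)]
    for row, c in zip(counts, labels):
        for j, v in enumerate(row):
            cluster_counts[c][j] += v
    total = sum(sum(row) for row in cluster_counts)
    h = 0.0
    for row in cluster_counts:
        n_c = sum(row)
        for v in row:
            if v > 0:  # empty cells (and empty clusters) contribute nothing
                h -= (v / total) * math.log(v / n_c)
    return h


def monte_carlo_cluster(counts, k, n_steps=2000, seed=0):
    """Random single-object moves, kept only when the criterion decreases."""
    rng = random.Random(seed)
    # Start from a random partition into k clusters.
    labels = [rng.randrange(k) for _ in counts]
    best = partition_entropy(counts, labels, k)
    for _ in range(n_steps):
        i = rng.randrange(len(counts))  # pick an object at random
        old, new = labels[i], rng.randrange(k)  # and a candidate cluster
        if new == old:
            continue
        labels[i] = new
        h = partition_entropy(counts, labels, k)
        if h < best:
            best = h  # keep the move
        else:
            labels[i] = old  # undo the move
    return labels, best
```

Calling `monte_carlo_cluster([[10, 0], [8, 0], [0, 9], [0, 7]], k=2)` on this toy table should group the first two rows together and the last two together, since that partition drives the stand-in criterion to zero. The sketch recomputes the criterion from scratch at every step for clarity; an incremental update would be the natural choice for larger data.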


Gradient Descent, Optimal Partition, Vector Element, Group Word, Speech Technology
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin · Heidelberg 2000

Authors and Affiliations

  • M. Jardino (1)
  1. Laboratoire d’Informatique pour la Mécanique et les Sciences de l’Ingénieur, Orsay Cedex, France
