Symbolic Cluster Analysis

  • E. Diday
  • M. Paula Brito


The aim of this paper is to introduce the symbolic approach to data analysis and to show that it extends data analysis to more complex data, which may be closer to multidimensional reality. We introduce several kinds of symbolic objects ("events", "assertions", and also "hordes" and "synthesis" objects), which are defined by a logical conjunction of properties concerning the variables. Such objects can, for instance, take several values for the same variable, and they can accommodate missing and nonsense values. Background knowledge may be represented by hierarchical or pyramidal taxonomies. In clustering, the problem remains that of finding inter-class structures such as partitions, hierarchies and pyramids on symbolic objects. Symbolic data analysis is guided by several principles: accuracy of the representation, coherence between the kinds of objects used as input and output, predominance of knowledge in driving the algorithms, and self-explanation of the results. We define order, union and intersection between symbolic objects and conclude that they are organised according to an inheritance lattice. We study several properties and qualities of symbolic objects, of classes and of classifications of symbolic objects. Modal symbolic objects are then introduced. Finally, we present an algorithm that represents the clusters of a partition by modal assertions and obtains a locally optimal partition according to a given criterion.
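As an informal illustration of the idea of an assertion as a conjunction of properties, the sketch below models an assertion as a mapping from variable names to sets of allowed values. This is only one possible reading: the names `satisfies`, `is_more_specific`, `join` and `meet` are hypothetical, and the per-variable definitions of order, union and intersection are simplifying assumptions that may differ from the definitions given in the paper.

```python
from typing import Dict, FrozenSet, Hashable, Mapping

# Assumption: an assertion is a conjunction of "events", each event
# stating that one variable takes its value in a given set.
Assertion = Dict[str, FrozenSet[Hashable]]


def satisfies(individual: Mapping[str, Hashable], a: Assertion) -> bool:
    """True if the individual meets every event of the assertion.

    Simplification: a variable missing from the individual fails the event.
    """
    return all(var in individual and individual[var] in allowed
               for var, allowed in a.items())


def is_more_specific(a: Assertion, b: Assertion) -> bool:
    """Order: a is below b if a constrains every variable of b at least as tightly."""
    return all(var in a and a[var] <= b[var] for var in b)


def join(a: Assertion, b: Assertion) -> Assertion:
    """Union (generalisation): widen shared variables, drop unshared constraints."""
    return {var: a[var] | b[var] for var in a.keys() & b.keys()}


def meet(a: Assertion, b: Assertion) -> Assertion:
    """Intersection (specialisation): keep all constraints, narrow shared ones."""
    result = dict(a)
    for var, allowed in b.items():
        result[var] = result[var] & allowed if var in result else allowed
    return result


# Illustrative data only; the variables and values are made up.
a1 = {"colour": frozenset({"red"}), "size": frozenset({1, 2})}
a2 = {"colour": frozenset({"red", "blue"}), "size": frozenset({2, 3})}
upper = join(a1, a2)   # generalises both a1 and a2
lower = meet(a1, a2)   # specialises both a1 and a2
```

Under these assumptions, `lower` is more specific than both `a1` and `a2`, which are in turn more specific than `upper`, giving a small example of the kind of inheritance ordering the abstract refers to.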


Keywords: Modal Logic, Elementary Event, Elementary Object, Symbolic Order, Symbolic Analysis





Copyright information

© Springer-Verlag Berlin · Heidelberg 1989

Authors and Affiliations

  • E. Diday (1)
  • M. Paula Brito (1)
  1. University Paris IX - Dauphine and INRIA, France
