Abstract
Unsupervised clustering algorithms aim to summarize a dataset so that similar objects are grouped together and dissimilar ones are separated. In data analysis, tools for interpreting the resulting clusters are often needed. For symbolic attributes, some existing criteria are based on frequency estimates of the attribute-value pairs. Our approach is to integrate the construction of the interpretation into the clustering process itself. To this end, we propose an algorithm that produces two partitions, one of the set of objects and one of the set of attribute-value pairs, such that the two partitions are as strongly associated as possible. In this article, we present a study of several functions for evaluating the strength of this association.
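As a rough illustration of what "association between two partitions" can mean, the sketch below cross-tabulates a partition of objects against a partition of attribute-value pairs and scores the table with the Goodman-Kruskal tau coefficient. The abstract does not name the functions under study, so the choice of tau is an assumption made here for illustration only, and the names `co_cluster_table` and `goodman_kruskal_tau`, the binary input encoding, and the toy data are all hypothetical.

```python
from typing import List, Sequence


def co_cluster_table(
    data: Sequence[Sequence[int]],
    row_labels: Sequence[int],
    col_labels: Sequence[int],
    n_row_clusters: int,
    n_col_clusters: int,
) -> List[List[int]]:
    """Cross-tabulate a binary object x attribute-value matrix.

    data[i][j] is 1 when object i takes attribute-value pair j (assumed encoding).
    Cell (k, l) counts the 1s shared by object cluster k and pair cluster l.
    """
    table = [[0] * n_col_clusters for _ in range(n_row_clusters)]
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            table[row_labels[i]][col_labels[j]] += value
    return table


def goodman_kruskal_tau(table: Sequence[Sequence[int]]) -> float:
    """Goodman-Kruskal tau: how well the row partition predicts the column partition."""
    total = sum(sum(row) for row in table)
    if total == 0:
        raise ValueError("empty contingency table")
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    # Sum of squared marginal column proportions.
    col_sq = sum((c / total) ** 2 for c in col_tot)
    if col_sq == 1.0:
        # A single column cluster leaves nothing to predict; 0.0 is a convention here.
        return 0.0
    # Sum of squared joint proportions, each divided by its row marginal.
    cond = sum(
        (n / total) ** 2 / (row_tot[k] / total)
        for k, row in enumerate(table)
        if row_tot[k] > 0
        for n in row
    )
    return (cond - col_sq) / (1.0 - col_sq)


# Toy data: objects 0-1 share pairs 0-1, objects 2-3 share pairs 2-3.
toy = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
aligned = co_cluster_table(toy, [0, 0, 1, 1], [0, 0, 1, 1], 2, 2)
shuffled = co_cluster_table(toy, [0, 0, 1, 1], [0, 1, 0, 1], 2, 2)
print(goodman_kruskal_tau(aligned), goodman_kruskal_tau(shuffled))
```

On this toy matrix, the column partition that matches the block structure scores higher than the one that cuts across it. A search procedure that maximizes such a score over both partitions would be one way to realize the idea in the abstract, though the algorithm actually proposed in the paper may differ.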
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Robardet, C., Feschet, F. (2001). Comparison of Three Objective Functions for Conceptual Clustering. In: De Raedt, L., Siebes, A. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2001. Lecture Notes in Computer Science, vol 2168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44794-6_33
DOI: https://doi.org/10.1007/3-540-44794-6_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42534-2
Online ISBN: 978-3-540-44794-8
eBook Packages: Springer Book Archive