Abstract
This paper describes a theoretical framework for inducing knowledge from incomplete data sets. The framework is general and can be used with any formalism based on a lattice structure. It is illustrated with two formalisms: the attribute-value formalism and Sowa’s conceptual graphs. The induction engine is an unsupervised algorithm called default clustering, which uses the concept of stereotype and the new notion of default subsumption, inspired by default logic. The paper closes with a validation on artificial data sets and an application to the extraction of stereotypes from newspaper articles.
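The abstract names default subsumption without defining it. The following minimal sketch shows one plausible reading in the attribute-value formalism, under stated assumptions: a description is a partial mapping from attributes to values, a missing attribute means "unknown", and a stereotype subsumes a fact by default when the two can be completed into a single non-contradictory description. The names `Description`, `subsumes`, `merge`, and `subsumes_by_default` and the sample attributes are hypothetical illustrations, not taken from the paper.

```python
from typing import Dict, Optional

# Assumption: a description maps attribute names to values;
# an attribute that is absent is treated as unknown (missing data).
Description = Dict[str, str]


def subsumes(general: Description, specific: Description) -> bool:
    """Classical subsumption: every value fixed by `general` appears in `specific`."""
    return all(specific.get(attr) == value for attr, value in general.items())


def merge(d1: Description, d2: Description) -> Optional[Description]:
    """Return a common completion of d1 and d2, or None if they contradict."""
    merged = dict(d1)
    for attr, value in d2.items():
        if merged.get(attr, value) != value:
            return None  # same attribute, different values
        merged[attr] = value
    return merged


def subsumes_by_default(stereotype: Description, fact: Description) -> bool:
    """Assumed reading of default subsumption: a common completion exists."""
    return merge(stereotype, fact) is not None


stereotype = {"profession": "politician", "trait": "ambitious"}
incomplete_fact = {"profession": "politician"}  # trait is unknown
contradicting_fact = {"profession": "politician", "trait": "modest"}

strict = subsumes(stereotype, incomplete_fact)
by_default = subsumes_by_default(stereotype, incomplete_fact)
conflict = subsumes_by_default(stereotype, contradicting_fact)
```

Under this reading, the incomplete fact is not covered by classical subsumption because its `trait` is unknown, yet it is covered by default since nothing in it contradicts the stereotype. This is the property that lets a clustering step assign sparse descriptions to a stereotype without first imputing the missing values.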
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Velcin, J., Ganascia, JG. (2007). Default Clustering with Conceptual Structures. In: Spaccapietra, S., et al. Journal on Data Semantics VIII. Lecture Notes in Computer Science, vol 4380. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70664-9_1
DOI: https://doi.org/10.1007/978-3-540-70664-9_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70663-2
Online ISBN: 978-3-540-70664-9
eBook Packages: Computer Science, Computer Science (R0)