Default Clustering with Conceptual Structures

  • Julien Velcin
  • Jean-Gabriel Ganascia
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4380)

Abstract

This paper describes a theoretical framework for inducing knowledge from incomplete data sets. The framework can be used with any formalism based on a lattice structure, and it is illustrated with two: the attribute-value formalism and Sowa’s conceptual graphs. The induction engine is an unsupervised algorithm called default clustering, which relies on the concept of stereotype and on the new notion of default subsumption, inspired by default logic. The paper ends with a validation on artificial data sets and an application to the extraction of stereotypes from newspaper articles.
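The abstract does not define default subsumption, so the following is only a minimal sketch of one plausible reading in the attribute-value formalism. It assumes that a description is a partial mapping from attributes to values (a missing attribute means "unknown"), and that a stereotype subsumes a fact by default when the two never disagree on an attribute they both specify, so that a common completion exists. The names `Description`, `subsumes`, `default_subsumes`, and `complete` are hypothetical and do not come from the paper.

```python
from typing import Dict, Optional

# Assumption: a description is a partial attribute -> value mapping;
# an attribute that is absent is treated as unknown (missing data).
Description = Dict[str, str]


def subsumes(general: Description, specific: Description) -> bool:
    """Classical subsumption: every pair stated in `general` is stated in `specific`."""
    return all(specific.get(attr) == val for attr, val in general.items())


def default_subsumes(stereotype: Description, fact: Description) -> bool:
    """Assumed reading of default subsumption: the stereotype and the fact
    never contradict each other on an attribute that both specify."""
    return all(fact[attr] == val for attr, val in stereotype.items() if attr in fact)


def complete(stereotype: Description, fact: Description) -> Optional[Description]:
    """Fill the unknown attributes of `fact` with the stereotype's values,
    or return None when the two descriptions conflict."""
    if not default_subsumes(stereotype, fact):
        return None
    return {**stereotype, **fact}


stereotype = {"color": "red", "shape": "round"}
incomplete_fact = {"color": "red", "size": "small"}   # shape is unknown
conflicting_fact = {"color": "green"}

print(subsumes(stereotype, incomplete_fact))          # False: shape is missing
print(default_subsumes(stereotype, incomplete_fact))  # True: no contradiction
print(default_subsumes(stereotype, conflicting_fact)) # False: color conflicts
print(complete(stereotype, incomplete_fact))
```

Under this reading, classical subsumption fails as soon as a value is missing, whereas the default variant only fails on an explicit contradiction, which is what would make it usable on incomplete data.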

Keywords

Tabu Search; Conceptual Structure; Description Space; Default Rule; Conceptual Graph

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Julien Velcin (1)
  • Jean-Gabriel Ganascia (1)
  1. LIP6, Université Paris VI, 8 rue du Capitaine Scott, 75015 Paris, France
