A Knowledge Scout for Discovering Medical Patterns: Methodology and System SCAMP

  • Kenneth A. Kaufman
  • Ryszard S. Michalski
Conference paper
Part of the Advances in Soft Computing book series (AINSC, volume 7)


Knowledge scouts are software agents that autonomously synthesize knowledge of interest to a given user (target knowledge) by applying inductive database operators to a local or distributed dataset. This paper briefly describes a method and a scripting language for developing knowledge scouts, and then reports on experiments with one such scout, SCAMP, which discovers patterns characterizing relationships among lifestyles, symptoms, and diseases in a large medical database. Discovered patterns are presented in two forms: (1) attributional rules, which are expressions in attributional calculus, and (2) association graphs, which represent graphically and abstractly the relations expressed by the rules. Preliminary results indicate that the presented methodology has high potential utility for deriving useful and understandable knowledge.
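
To make the two output forms concrete, the following is a minimal illustrative sketch, not the SCAMP implementation or its scripting language. It assumes a much-simplified rule form in which each condition tests whether an attribute takes one of a set of values, and it derives a graph view by linking each premise attribute to the consequent attribute. The names `Condition`, `Rule`, `support_and_confidence`, and `association_edges`, as well as the toy records, are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    """Simplified condition: the attribute takes one of the listed values."""
    attribute: str
    values: frozenset

    def holds(self, record: dict) -> bool:
        # A missing attribute yields None, which is never in the value set.
        return record.get(self.attribute) in self.values


@dataclass(frozen=True)
class Rule:
    """Simplified rule 'consequent <= premise'; the premise is a conjunction."""
    consequent: Condition
    premise: tuple

    def covers(self, record: dict) -> bool:
        return all(cond.holds(record) for cond in self.premise)


def support_and_confidence(rule: Rule, records: list) -> tuple:
    """Count records satisfying premise and consequent, and the confidence ratio."""
    covered = [r for r in records if rule.covers(r)]
    correct = [r for r in covered if rule.consequent.holds(r)]
    confidence = len(correct) / len(covered) if covered else 0.0
    return len(correct), confidence


def association_edges(rules: list) -> list:
    """Graph view (assumed form): one edge per premise attribute -> consequent attribute."""
    return sorted(
        {(cond.attribute, rule.consequent.attribute)
         for rule in rules for cond in rule.premise}
    )


# Hypothetical toy records, invented for illustration only.
records = [
    {"smoking": "heavy", "exercise": "rare", "asthma": "yes"},
    {"smoking": "heavy", "exercise": "rare", "asthma": "no"},
    {"smoking": "none", "exercise": "regular", "asthma": "no"},
    {"smoking": "moderate", "exercise": "rare", "asthma": "yes"},
]

rule = Rule(
    consequent=Condition("asthma", frozenset({"yes"})),
    premise=(
        Condition("smoking", frozenset({"heavy", "moderate"})),
        Condition("exercise", frozenset({"rare"})),
    ),
)

if __name__ == "__main__":
    print(support_and_confidence(rule, records))
    print(association_edges([rule]))
```

The sketch keeps the rule and graph views separate so that the same rule set can be inspected either as logical expressions or as attribute-to-attribute links; the actual association graphs described in the paper may encode more information than these bare edges.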


Association Rule; Query Language; Conceptual Cluster; Database Operator; Asthma Condition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Kenneth A. Kaufman (1)
  • Ryszard S. Michalski (1, 2)
  1. Machine Learning and Inference Laboratory, George Mason University, Fairfax, USA
  2. Polish Academy of Sciences, Institute of Computer Science, Warsaw, Poland
