Drifting Concepts as Hidden Factors in Clinical Studies

  • Matjaž Kukar
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2780)

Abstract

Most statistical, Machine Learning and Data Mining algorithms assume that the data they use is a random sample drawn from a stationary distribution. Unfortunately, many of the databases available for mining today violate this assumption. They were gathered over months or years, and the underlying processes generating them may have changed during this time, sometimes radically (a phenomenon known as concept drift). Similar situations may occur in clinical institutions, where patients’ data are regularly stored in a central computer database. Expert physicians may easily, even unconsciously, adapt to the changed environment, whereas Machine Learning and Data Mining tools may fail because their underlying assumptions no longer hold. It is therefore important to detect and adapt to the changed situation. In this paper we review several techniques for dealing with concept drift in Machine Learning and Data Mining frameworks and evaluate their use in clinical studies with a case study of coronary artery disease diagnostics.

Keywords

concept drift, partial memory learning, windowing, gradual forgetting, clinical studies, Machine Learning, Data Mining
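As a rough illustration of two of the techniques named in the keywords, windowing and gradual forgetting, the sketch below compares them with plain full-memory learning on a toy label stream whose concept changes partway through. Everything in it is an assumption made for illustration and is not taken from the paper: the majority-vote "learner", the function names, the window size of 30, the forgetting rate k = 0.8, and the linear shape of the forgetting weights.

```python
from collections import Counter


def weighted_majority(labels, weights):
    """Return the label with the largest total weight (a stand-in learner)."""
    totals = Counter()
    for label, weight in zip(labels, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def uniform_weights(n):
    """Full memory: every example counts equally, however old it is."""
    return [1.0] * n


def window_weights(n, size):
    """Windowing: only the most recent `size` examples count at all."""
    kept = min(size, n)
    return [0.0] * (n - kept) + [1.0] * kept


def linear_forgetting_weights(n, k):
    """Gradual forgetting (illustrative linear form): weights rise from
    1 - k for the oldest example to 1 + k for the newest, with 0 <= k <= 1."""
    if n == 1:
        return [1.0]
    step = 2.0 * k / (n - 1)
    return [1.0 - k + step * i for i in range(n)]


# Toy stream, oldest example first: the concept flips after 60 examples.
stream = [0] * 60 + [1] * 40
n = len(stream)

full_memory = weighted_majority(stream, uniform_weights(n))
windowed = weighted_majority(stream, window_weights(n, size=30))
forgetting = weighted_majority(stream, linear_forgetting_weights(n, k=0.8))

print("full memory:", full_memory)
print("windowing:", windowed)
print("gradual forgetting:", forgetting)
```

On this toy stream the full-memory learner still predicts the old concept (label 0), because the 60 outdated examples outnumber the 40 recent ones, while both the windowed and the gradually forgetting learner predict the new concept (label 1). Windowing discards old examples outright, whereas gradual forgetting keeps them with reduced influence.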

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Matjaž Kukar (1)
  1. Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
