Drifting Concepts as Hidden Factors in Clinical Studies
Most statistical, Machine Learning and Data Mining algorithms assume that the data they use are a random sample drawn from a stationary distribution. Unfortunately, many of the databases available for mining today violate this assumption. They were gathered over months or years, and the underlying processes generating them may have changed during this time, sometimes radically (a phenomenon known as concept drift). Similar situations may occur in clinical institutions, where patients' data are regularly stored in central computer databases. Expert physicians may easily, even unconsciously, adapt to the changed environment, whereas Machine Learning and Data Mining tools may fail because of their underlying assumptions. It is therefore important to detect the change and adapt to it. In this paper we review several techniques for dealing with concept drift in Machine Learning and Data Mining frameworks and evaluate their use in clinical studies with a case study of coronary artery disease diagnostics.
Keywords: concept drift, partial memory learning, windowing, gradual forgetting, clinical studies, Machine Learning, Data Mining
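To make two of the keywords concrete, the sketch below contrasts windowing (keep only the newest examples) with gradual forgetting (down-weight older examples) on a toy stream whose positive rate changes halfway through. It is an illustration only, not the method evaluated in the paper: the function names, the window size of 10, the geometric decay factor of 0.9, and the synthetic labels are all assumptions made for this example.

```python
def window_weights(n_examples, window_size):
    """Windowing: weight 1.0 for the newest `window_size` examples, 0.0 for older ones."""
    cutoff = max(0, n_examples - window_size)
    return [0.0 if i < cutoff else 1.0 for i in range(n_examples)]


def gradual_forgetting_weights(n_examples, decay=0.9):
    """Gradual forgetting (assumed geometric form): the newest example has weight 1.0,
    and each step back in time multiplies the weight by `decay`."""
    return [decay ** (n_examples - 1 - i) for i in range(n_examples)]


def weighted_positive_rate(labels, weights):
    """Estimate the positive-class rate as a weighted average of 0/1 labels."""
    return sum(w * y for w, y in zip(weights, labels)) / sum(weights)


# Synthetic stream ordered oldest to newest (assumption for illustration):
# the positive rate is 0.2 in the first half and 0.8 in the second half.
labels = [1, 0, 0, 0, 0] * 4 + [1, 1, 1, 1, 0] * 4
n = len(labels)

full = weighted_positive_rate(labels, [1.0] * n)                 # all data, equal weight
windowed = weighted_positive_rate(labels, window_weights(n, 10)) # newest 10 examples only
forgetting = weighted_positive_rate(labels, gradual_forgetting_weights(n, 0.9))

print(f"full history: {full:.3f}")
print(f"window of 10: {windowed:.3f}")
print(f"gradual forgetting: {forgetting:.3f}")
```

The estimate over the full history averages the old and new concepts together, while the windowed and decayed estimates move toward the more recent positive rate. The same weights could be passed to any learner that accepts per-example weights.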