Abstract
Classification plays an important role in medicine, especially in medical diagnosis. Health applications often require classifiers that minimize total cost, including both misclassification costs and test costs. There are good reasons to consider costs in medicine: diagnostic tests are not free and health budgets are limited. Our aim in this work was to define, implement and test a strategy for cost-sensitive learning. We defined an algorithm for decision tree induction that considers costs, including test costs, delayed costs and costs associated with risk. We then applied our strategy to train and evaluate cost-sensitive decision trees on medical data. The resulting trees can be tested under several strategies, including group costs, common costs, and individual costs. Using a “risk” factor, it is possible to penalize invasive or delayed tests and obtain patient-friendly decision trees.
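The idea of letting test cost and risk influence which attribute a tree splits on can be illustrated with a small sketch. The abstract does not state the splitting criterion used in the chapter, so the formula below (information gain divided by test cost times a risk factor, raised to a tunable weight) is an assumption chosen for illustration. The function names, the toy patient records, and the cost and risk values are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Reduction in entropy obtained by splitting on attribute `attr`."""
    n = len(labels)
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attr], []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def cost_adjusted_score(rows, labels, attr, test_cost, risk=1.0, cost_weight=1.0):
    """Assumed criterion: gain / (test_cost * risk) ** cost_weight.

    risk > 1 penalizes invasive or delayed tests; cost_weight = 0
    ignores costs and falls back to plain information gain.
    """
    gain = information_gain(rows, labels, attr)
    return gain / (test_cost * risk) ** cost_weight


def pick_attribute(rows, labels, costs, risks, cost_weight=1.0):
    """Return the attribute with the best cost-adjusted score."""
    return max(
        costs,
        key=lambda a: cost_adjusted_score(
            rows, labels, a, costs[a], risks.get(a, 1.0), cost_weight
        ),
    )


# Hypothetical toy data: a cheap, imperfect blood test and a costly,
# invasive biopsy that separates the classes perfectly.
rows = [
    {"blood_test": "high", "biopsy": "pos"},
    {"blood_test": "high", "biopsy": "pos"},
    {"blood_test": "low", "biopsy": "pos"},
    {"blood_test": "low", "biopsy": "neg"},
    {"blood_test": "low", "biopsy": "neg"},
    {"blood_test": "low", "biopsy": "neg"},
]
labels = ["sick", "sick", "sick", "healthy", "healthy", "healthy"]
costs = {"blood_test": 10.0, "biopsy": 100.0}   # illustrative test costs
risks = {"blood_test": 1.0, "biopsy": 2.0}      # illustrative risk factors

print(pick_attribute(rows, labels, costs, risks))                   # cost-aware choice
print(pick_attribute(rows, labels, costs, risks, cost_weight=0.0))  # cost-blind choice
```

With these illustrative numbers, the cost-aware call selects the blood test, while setting `cost_weight=0.0` selects the biopsy, which has the higher raw information gain. A full tree learner would apply such a score recursively at each node and would also account for misclassification costs, which this sketch leaves out.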
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Freitas, A., Costa-Pereira, A., Brazdil, P. (2007). Cost-Sensitive Decision Trees Applied to Medical Data. In: Song, I.Y., Eder, J., Nguyen, T.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2007. Lecture Notes in Computer Science, vol 4654. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74553-2_28
DOI: https://doi.org/10.1007/978-3-540-74553-2_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74552-5
Online ISBN: 978-3-540-74553-2
eBook Packages: Computer Science, Computer Science (R0)