Cost-Sensitive Decision Trees Applied to Medical Data

  • Conference paper
Data Warehousing and Knowledge Discovery (DaWaK 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4654)

Abstract

Classification plays an important role in medicine, especially in medical diagnosis. Health applications often require classifiers that minimize total cost, including misclassification costs and test costs. There are many reasons for considering costs in medicine, as diagnostic tests are not free and health budgets are limited. Our aim in this work was to define, implement and test a strategy for cost-sensitive learning. We defined an algorithm for decision tree induction that considers costs, including test costs, delayed costs and costs associated with risk. We then applied our strategy to train and evaluate cost-sensitive decision trees on medical data. The resulting trees can be tested under several strategies, including group costs, common costs, and individual costs. Using the “risk” factor, it is possible to penalize invasive or delayed tests and obtain patient-friendly decision trees.
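
As a rough illustration of the "total cost" idea in the abstract, the sketch below sums the cost of the tests performed on one patient and the cost of the resulting classification. It is not the algorithm from the paper: the function name `total_cost`, the test names, all numeric values, and the choice to multiply a test's cost by its risk factor are assumptions made only for this example.

```python
from typing import Dict, List, Tuple


def total_cost(
    tests_used: List[str],
    test_costs: Dict[str, float],
    risk_factors: Dict[str, float],
    predicted: str,
    actual: str,
    misclassification_costs: Dict[Tuple[str, str], float],
) -> float:
    """Return test costs plus misclassification cost for one patient.

    Assumption: a test's cost is scaled by a risk factor (default 1.0),
    so invasive or delayed tests count as more expensive.
    """
    testing = sum(
        test_costs[t] * risk_factors.get(t, 1.0) for t in tests_used
    )
    # Costs are keyed by (actual class, predicted class).
    error = misclassification_costs[(actual, predicted)]
    return testing + error


# Hypothetical values, chosen only to make the arithmetic easy to follow.
test_costs = {"blood_test": 10.0, "biopsy": 100.0}
risk_factors = {"biopsy": 2.0}  # biopsy is invasive, so it is penalized
misclassification_costs = {
    ("sick", "sick"): 0.0,
    ("healthy", "healthy"): 0.0,
    ("sick", "healthy"): 500.0,  # missed diagnosis
    ("healthy", "sick"): 50.0,   # false alarm
}

cheap_but_wrong = total_cost(
    ["blood_test"], test_costs, risk_factors,
    predicted="healthy", actual="sick",
    misclassification_costs=misclassification_costs,
)
thorough_and_right = total_cost(
    ["blood_test", "biopsy"], test_costs, risk_factors,
    predicted="sick", actual="sick",
    misclassification_costs=misclassification_costs,
)
print(cheap_but_wrong, thorough_and_right)
```

With these made-up numbers, skipping the biopsy saves on testing but incurs the cost of a missed diagnosis, which is the kind of trade-off a cost-sensitive classifier has to weigh.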

Editor information

Il Yeal Song, Johann Eder, Tho Manh Nguyen

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Freitas, A., Costa-Pereira, A., Brazdil, P. (2007). Cost-Sensitive Decision Trees Applied to Medical Data. In: Song, I.Y., Eder, J., Nguyen, T.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2007. Lecture Notes in Computer Science, vol 4654. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74553-2_28

  • DOI: https://doi.org/10.1007/978-3-540-74553-2_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74552-5

  • Online ISBN: 978-3-540-74553-2

  • eBook Packages: Computer Science, Computer Science (R0)
