
Cost Sensitive SVM with Non-informative Examples Elimination for Imbalanced Postoperative Risk Management Problem

  • Conference paper
Advances in Systems Science

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 240)


Abstract

In this paper we propose a novel combined approach to the imbalanced data problem, applied to predicting postoperative life expectancy in lung cancer patients. The solution combines undersampling techniques with a cost-sensitive SVM (Support Vector Machine). First, we eliminate non-informative examples by applying Tomek links together with one-sided selection. Second, we train a cost-sensitive SVM whose penalty costs are calculated from the cardinalities of the minority and majority classes. We evaluate the solution by comparing its performance with that of other SVM-based approaches that deal with imbalanced data. The experimental evaluation was performed on real-life data from the postoperative risk management domain.
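The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`tomek_links`, `drop_majority_in_links`, `class_costs`) and the toy data are hypothetical, the condensing step of one-sided selection is omitted, and the cost formula `C * n / (k * n_c)` is only one common way to derive per-class penalties from class cardinalities; the abstract does not give the exact formula used in the paper.

```python
from math import dist  # Euclidean distance, Python 3.8+


def tomek_links(X, y):
    """Return pairs (i, j), i < j, that are mutual nearest neighbours
    and carry different class labels (the definition of a Tomek link)."""
    def nearest(i):
        others = (j for j in range(len(X)) if j != i)
        return min(others, key=lambda j: dist(X[i], X[j]))

    nn = [nearest(i) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


def drop_majority_in_links(X, y, majority):
    """Undersample: remove only the majority-class member of each Tomek link."""
    drop = {k for pair in tomek_links(X, y) for k in pair if y[k] == majority}
    keep = [k for k in range(len(X)) if k not in drop]
    return [X[k] for k in keep], [y[k] for k in keep]


def class_costs(y, C=1.0):
    """Assumed heuristic: penalty for class c is C * n / (k * n_c),
    so the rarer class receives the larger misclassification cost."""
    n = len(y)
    counts = {c: y.count(c) for c in set(y)}
    return {c: C * n / (len(counts) * m) for c, m in counts.items()}


# Hypothetical toy data: label 0 is the majority class, label 1 the minority.
X = [(0.0, 0.0), (0.0, 0.9), (1.0, 0.0), (2.1, 2.0), (2.0, 2.0), (3.0, 3.0)]
y = [0, 0, 0, 0, 1, 1]

links = tomek_links(X, y)
X_clean, y_clean = drop_majority_in_links(X, y, majority=0)
costs = class_costs(y_clean)
print(links, y_clean, costs)
```

The resulting per-class costs could then be passed to any SVM implementation that accepts class-dependent penalties, in place of a single penalty parameter shared by both classes.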




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Zięba, M., Świątek, J., Lubicz, M. (2014). Cost Sensitive SVM with Non-informative Examples Elimination for Imbalanced Postoperative Risk Management Problem. In: Świątek, J., Grzech, A., Świątek, P., Tomczak, J. (eds) Advances in Systems Science. Advances in Intelligent Systems and Computing, vol 240. Springer, Cham. https://doi.org/10.1007/978-3-319-01857-7_29


  • DOI: https://doi.org/10.1007/978-3-319-01857-7_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-01856-0

  • Online ISBN: 978-3-319-01857-7

  • eBook Packages: Engineering, Engineering (R0)
