Bayesian Network Retrieval Discrimination Criteria Model Based on Unbalanced Information

  • Conference paper
Smart Health (ICSH 2018)

Abstract

Unbalanced sample data are usually ignored in case matching, yet they lead to misclassification during matching. To address this problem, we propose a discrimination criteria model based on the Bayesian network, together with a corresponding algorithm. Cost-sensitive learning of the Bayesian network in this model relies on the minimization theorem of the loss function. We also use an ROC curve to evaluate the performance of the retrieval model, and we verify the model's validity on clinical diagnostic data for heart disease. Our results indicate that this method can effectively eliminate the cost-sensitivity problem of imbalanced datasets and improve the accuracy of the retrieval results.
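To make the two ideas named in the abstract concrete, the following is a minimal sketch of a generic minimum-expected-loss decision rule and of the area under an ROC curve. It is not the authors' model or algorithm, which this page does not describe. The function names, the cost values, and the assumption that a posterior probability `p_pos` is already available (for example from a Bayesian network) are all illustrative.

```python
from typing import Sequence


def min_cost_decision(p_pos: float, cost_fn: float, cost_fp: float) -> int:
    """Pick the label with the lower expected misclassification cost.

    p_pos is an assumed posterior probability of the positive (minority)
    class; cost_fn and cost_fp are hypothetical costs of a false negative
    and a false positive. Correct decisions are assumed to cost nothing.
    """
    risk_if_negative = p_pos * cost_fn          # wrong only if truly positive
    risk_if_positive = (1.0 - p_pos) * cost_fp  # wrong only if truly negative
    return 1 if risk_if_positive <= risk_if_negative else 0


def cost_threshold(cost_fn: float, cost_fp: float) -> float:
    """Posterior threshold at which both decisions have equal expected cost."""
    return cost_fp / (cost_fp + cost_fn)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Counts the fraction of (positive, negative) pairs in which the positive
    case receives the higher score; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # With equal costs, a posterior of 0.3 is labelled negative.
    print(min_cost_decision(0.3, cost_fn=1.0, cost_fp=1.0))
    # Making a missed positive five times as costly flips the decision.
    print(min_cost_decision(0.3, cost_fn=5.0, cost_fp=1.0))
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

Raising the false-negative cost lowers the decision threshold, which is one common way to counter the bias toward the majority class; the AUC summarizes ranking quality independently of any single threshold.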

Author information

Correspondence to Jiang Shen.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Xu, M., Gan, D., Shen, J., An, B. (2018). Bayesian Network Retrieval Discrimination Criteria Model Based on Unbalanced Information. In: Chen, H., Fang, Q., Zeng, D., Wu, J. (eds) Smart Health. ICSH 2018. Lecture Notes in Computer Science, vol 10983. Springer, Cham. https://doi.org/10.1007/978-3-030-03649-2_21

  • DOI: https://doi.org/10.1007/978-3-030-03649-2_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-03648-5

  • Online ISBN: 978-3-030-03649-2

  • eBook Packages: Computer Science, Computer Science (R0)
