Data Mining Techniques for Disease Risk Prediction Model: A Systematic Literature Review

  • Conference paper
Recent Trends in Data Science and Soft Computing (IRICT 2018)

Abstract

A risk prediction model estimates the occurrence of an event based on related data. Conventional statistical metrics that use primary data generate simple descriptive analyses that often provide insufficient knowledge for decision making. In contrast, data mining techniques, which can find hidden patterns in secondary data held in large databases and generate predictions for a desired output, have become a popular approach to developing risk prediction models. In healthcare in particular, data mining techniques can be applied in disease risk prediction models to provide reliable predictions of the possibility of acquiring a disease based on an individual’s clinical and non-clinical data. Given the increased use of data mining in healthcare, this study aims to identify the data mining techniques and algorithms commonly implemented in studies of various disease risk prediction models, and to report the accuracy of those algorithms. Accuracy can be evaluated by various methods, but this paper focuses on overall accuracy, measured as the number of correctly predicted outputs divided by the total number of predictions. A systematic literature review that searched five databases found 170 articles, of which 7 were selected in the final process. The review found that most prediction models used classification techniques, with a focus on decision tree, neural network, support vector machine, and Naïve Bayes algorithms, and that heart-related disease was the most commonly studied. Further research can apply similar algorithms to develop risk prediction models for other types of disease, such as infectious diseases.
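
The overall-accuracy measure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration only, not code from the paper: the function name `overall_accuracy` and the example labels are hypothetical, and the labels are made up purely to show the arithmetic.

```python
from typing import Sequence


def overall_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one prediction is required")
    # Count the positions where the predicted label equals the true label.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels (assumption): 1 = disease present, 0 = disease absent.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
print(overall_accuracy(actual, predicted))  # 6 of 8 correct -> 0.75
```

Overall accuracy treats every error alike, so on a dataset where one class is rare it can look high even when the rare class is mostly missed; this is one reason other evaluation methods exist alongside it.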

Acknowledgement

This work was supported by the Ministry of Higher Education, Malaysia under FRGS/1/2016/ICT04/UNITEN/03/1.

Author information

Corresponding author

Correspondence to Wan Muhamad Taufik Wan Ahmad.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ahmad, W.M.T.W., Ghani, N.L.A., Drus, S.M. (2019). Data Mining Techniques for Disease Risk Prediction Model: A Systematic Literature Review. In: Saeed, F., Gazem, N., Mohammed, F., Busalim, A. (eds) Recent Trends in Data Science and Soft Computing. IRICT 2018. Advances in Intelligent Systems and Computing, vol 843. Springer, Cham. https://doi.org/10.1007/978-3-319-99007-1_4
