Risk Prediction Analysis of Cardiovascular Disease Using Supervised Machine Learning Techniques

  • A. Ishwarya
  • S. K. Jayanthi
Conference paper


The most effective way to reduce deaths from curable diseases is to detect them early and prevent their onset. At present, cardiovascular disease (CVD) accounts for a large share of deaths in our society, so early detection of CVD is critical, even though many practices for early risk prediction already exist. One approach to early risk prediction is the use of risk prediction models built with machine learning techniques. Such models can help clinicians treat patients with heart disease more effectively. In this chapter, classification methods are therefore applied to predict disease status. The machine learning algorithms used to predict CVD are EDC-AIRS, Decision Tree, and SVM, applied to the heart disease dataset from the UCI repository. Predictions are reported in terms of accuracy, and performance is further measured by sensitivity, specificity, and F-measure. The results indicate that the prediction model developed using the SVM algorithm achieves high sensitivity, specificity, balanced accuracy, and F-measure. Further, these models can be integrated into a computer-aided screening tool that clinicians can use to predict the CVD risk status after performing the necessary clinical assessments.
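The performance measures named in the abstract can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions, not the authors' implementation; the function name `binary_metrics`, the label encoding (1 = disease present, 0 = absent), and the example labels are assumptions made purely for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Compute screening metrics for binary labels (1 = disease, 0 = no disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # recall on the diseased class
    specificity = ratio(tn, tn + fp)  # recall on the healthy class
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f_measure": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Made-up labels for illustration only; not data or results from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Balanced accuracy averages sensitivity and specificity, so it is less affected than plain accuracy when diseased and healthy cases are unevenly represented in the data.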


Keywords

  • Cardiovascular disease
  • Classification
  • Clinical risk prediction



Abbreviations

  • Cardiovascular disease
  • Evolutionary data-conscious artificial immune recognition system
  • K-Nearest Neighbor
  • Support vector machines
  • Decision tree
  • High-density lipoprotein
  • Low-density lipoprotein
  • Chest pain
  • Fasting blood sugar
  • Information gain
  • Gini Index
  • Cholesterol level
  • Resting blood pressure



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • A. Ishwarya
    • 1
  • S. K. Jayanthi
    • 1
  1. Department of Computer Science, Vellalar College for Women, Erode, India
