Risk Factors Analysis and Classification on Heart Disease


In recent years, heart disease (HD) has been highly prevalent among 50-year-old people in China, and it has become the leading cause of death in old age. Effective early forecasting of HD risk from patient data is an interesting and challenging task. In this paper, we propose a novel method that analyzes risk factors as views of grouped features. Entropy-based normalized mutual information and the information gain ratio are employed to select factors. Discriminant minimum class locality preserving canonical correlation analysis is presented to determine the effectiveness of each view of group factors. Moreover, a novel model is given to forecast risk levels under the New York Heart Association Functional Classification. To verify the effectiveness of the proposed method and model, we collected the electronic health records of 1271 patients from 28 Chinese Level III-A hospitals in 2015. The risk factor analysis leads to several conclusions: (1) Patients with HD usually suffer from similar complications; for example, most of them suffer from hypertension, diabetes and arrhythmia at the same time. (2) The risk forecasting achieves an accurate recognition rate, and a patient's risk level is affected by the complications. (3) Hypertension, arrhythmia, chronic cardiac insufficiency and coronary disease are the most frequent concurrent diseases. The output of the proposed model supports a highly reliable decision on the level of cardiac functional disease.
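The factor selection step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the code below assumes two common textbook definitions: normalized mutual information as I(X;Y) / sqrt(H(X) H(Y)), and information gain ratio as I(X;Y) / H(X). The function names and the toy records are hypothetical and are not taken from the paper's data set.

```python
from collections import Counter
from math import log2, sqrt


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for paired discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def normalized_mutual_information(xs, ys):
    """NMI with geometric-mean normalization (an assumed convention)."""
    hx, hy = entropy(xs), entropy(ys)
    if hx == 0 or hy == 0:
        return 0.0  # a constant variable carries no information
    return mutual_information(xs, ys) / sqrt(hx * hy)


def information_gain_ratio(feature, label):
    """Information gain of `feature` about `label`, divided by H(feature)."""
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0
    return mutual_information(feature, label) / split_info


# Hypothetical toy records (not data from the paper): 1 = present, 0 = absent.
factors = {
    "hypertension": [1, 1, 1, 1, 0, 0, 0, 0],
    "diabetes": [1, 0, 1, 0, 1, 0, 1, 0],
}
high_risk = [1, 1, 1, 1, 0, 0, 0, 0]

for name, column in factors.items():
    nmi = normalized_mutual_information(column, high_risk)
    igr = information_gain_ratio(column, high_risk)
    print(f"{name}: NMI={nmi:.3f}, gain ratio={igr:.3f}")
```

In this toy data the first factor coincides with the label and scores highest on both measures, while the second is statistically independent of the label and scores zero. A selection step of this kind would keep the factors whose scores exceed a chosen threshold; how the paper sets that threshold is not stated in the abstract.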




Author information



Corresponding author

Correspondence to Yubo Yuan.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Luo, J., Yan, H. & Yuan, Y. Risk Factors Analysis and Classification on Heart Disease. Soft Comput 24, 13167–13178 (2020). https://doi.org/10.1007/s00500-020-04731-z


  • Canonical correlation analysis
  • Heart disease
  • Functional classification
  • Risk factors analysis
  • Group factors