A Study of Features Affecting on Stroke Prediction Using Machine Learning

  • Panida Songram
  • Chatklaw Jareanpon
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11909)


In 2021, Thailand will become an ageing society. Health policy for older people is a challenging task that the Thai government has to plan carefully. Stroke is the leading cause of death among older people in Thailand, and knowing its risk factors helps people to prevent it. In this paper, features affecting stroke are studied using machine learning. Factors and diseases that occur before a stroke are used as features to detect stroke and to find the factors that influence it most. Stroke detection is investigated with four learning classifiers: SVM, Naïve Bayes, KNN, and decision tree. Moreover, Chi2 is adopted to find the influential factors, and the four most influential factors are examined to understand the risk of stroke. The study shows that the factors are more effective than the diseases for detecting stroke and that the decision tree is the best classifier, giving 72.10% accuracy and 74.29% F-measure. The factors affecting stroke are smoking, alcohol, cholesterol, blood pressure, sex, exercise, and occupation. Moreover, we found that not smoking helps to avoid stroke, whereas drinking alcohol, abnormal cholesterol, and abnormal blood pressure raise the risk of a stroke.
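As a rough illustration of how features can be ranked against the stroke label, the sketch below computes a plain Pearson chi-squared statistic for each categorical feature. It is only a minimal sketch under stated assumptions: the function name `chi2_score`, the feature names, and the eight toy records are hypothetical and invented here, and the Chi2 procedure used in the paper may differ in detail (for example, in how numeric attributes are discretized).

```python
from collections import Counter


def chi2_score(feature, label):
    """Pearson chi-squared statistic between two categorical sequences.

    A larger value means the feature is less independent of the label.
    """
    n = len(feature)
    observed = Counter(zip(feature, label))   # joint counts per (feature, label) cell
    feature_totals = Counter(feature)         # row totals
    label_totals = Counter(label)             # column totals
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            expected = f_count * l_count / n  # expected count under independence
            score += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return score


# Hypothetical toy records, invented for illustration only (not the paper's data).
stroke = [1, 1, 1, 1, 0, 0, 0, 0]
smoking = [1, 1, 1, 0, 0, 0, 0, 1]
sex = [1, 0, 1, 0, 1, 0, 1, 0]

features = {"smoking": smoking, "sex": sex}

# Rank feature names from most to least associated with the label.
ranking = sorted(
    features,
    key=lambda name: chi2_score(features[name], stroke),
    reverse=True,
)

for name in ranking:
    print(name, chi2_score(features[name], stroke))
```

On this toy data the smoking column scores higher than the sex column, which is independent of the label by construction. The ranking says nothing about the real dataset.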


Stroke prediction · Stroke classification · Risk factors for stroke · Affective factors of stroke



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Polar Lab, Department of Computer Science, Faculty of Informatics, Mahasarakham University, Mahasarakham, Thailand
