Genetic Algorithm Based Methods for Identification of Health Risk Factors Aimed at Preventing Metabolic Syndrome
In recent years, metabolic syndrome has emerged as a major health concern because it increases the risk of developing lifestyle diseases, such as diabetes, hypertension, and cardiovascular disease. Symptoms of metabolic syndrome include high blood pressure, decreased HDL cholesterol, and elevated triglycerides (TG). To prevent the development of metabolic syndrome, it is important both to predict the future values of these health risk factors accurately and to identify other factors in health checkup and lifestyle data that are strongly related to them. In this paper, we propose a new framework, based on the genetic algorithm and its variants, for identifying these important health factors and predicting a person's future health risk with high accuracy. We show the effectiveness of the proposed system by applying it to the health checkup and lifestyle data of Toshiba Corporation.
Keywords: Feature selection, classification, unbalanced data, metabolic syndrome, fitness evaluation, RPMBGA+, AUC balanced