A novel approach for disease comorbidity prediction using weighted association rule mining

  • K. S. Lakshmi
  • G. Vadivu
Original Research


Disease comorbidity prediction has gained the attention of many researchers in recent years. The bulk creation of clinical data in the form of electronic health records (EHRs), together with biological data, has opened the door to exploring disease associations and comorbidity patterns. This has led to the development of analytical tools for detecting disease comorbidities and analysing their causal genetic sources. Comorbidity prediction using statistical analysis, data mining and network analysis has made significant contributions to the medical field. Combining multiscale data has been shown to enhance the performance of disease comorbidity prediction techniques. Here we present a novel approach based on weighted association rule mining for predicting disease comorbidities using clinical and molecular data. Results demonstrated that the system outperformed existing systems in disease comorbidity prediction.
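To make the idea of weighted association rule mining concrete, the following is a minimal Python sketch of one simple, generic formulation, and not the authors' actual method. It assumes that each patient record is a set of diagnoses, that each disease carries a numeric weight, and that a record's weight is the mean weight of its items. The function names, the sample records and the weight values are all hypothetical and chosen only for illustration.

```python
def transaction_weight(transaction, item_weights):
    """Weight of one record: mean weight of its items (an assumed convention)."""
    return sum(item_weights[item] for item in transaction) / len(transaction)


def weighted_support(itemset, transactions, item_weights):
    """Share of the total record weight carried by records containing itemset."""
    total = sum(transaction_weight(t, item_weights) for t in transactions)
    covered = sum(
        transaction_weight(t, item_weights) for t in transactions if itemset <= t
    )
    return covered / total


def weighted_confidence(antecedent, consequent, transactions, item_weights):
    """Weighted support of the whole rule divided by that of its antecedent."""
    ws_antecedent = weighted_support(antecedent, transactions, item_weights)
    if ws_antecedent == 0:
        return 0.0
    ws_rule = weighted_support(antecedent | consequent, transactions, item_weights)
    return ws_rule / ws_antecedent


# Hypothetical patient records, each a set of diagnoses.
TRANSACTIONS = [
    frozenset({"diabetes", "hypertension"}),
    frozenset({"diabetes", "obesity"}),
    frozenset({"diabetes", "hypertension", "obesity"}),
    frozenset({"asthma"}),
]

# Hypothetical item weights; in practice these could come from any relevance score.
WEIGHTS = {"diabetes": 0.9, "hypertension": 0.8, "obesity": 0.6, "asthma": 0.3}

# With equal weights the measures reduce to ordinary support and confidence.
UNIFORM = {disease: 1.0 for disease in WEIGHTS}

if __name__ == "__main__":
    rule_items = frozenset({"diabetes", "hypertension"})
    print(weighted_support(rule_items, TRANSACTIONS, UNIFORM))
    print(weighted_support(rule_items, TRANSACTIONS, WEIGHTS))
    print(
        weighted_confidence(
            frozenset({"diabetes"}), frozenset({"hypertension"}), TRANSACTIONS, WEIGHTS
        )
    )
```

In this sketch, records made up of highly weighted diseases contribute more to a rule's support than records made up of low-weight ones, so the rule "diabetes implies hypertension" scores higher under the hypothetical weights than under uniform weights.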


Disease comorbidity; Gene ontology; Disease ontology; Protein-protein interaction; Pathway interaction; Association rule mining




Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Information Technology, Rajagiri School of Engineering & Technology, Ernakulam, India
  2. Department of Information Technology, SRM Institute of Science and Technology, Chennai, India
