A Systematic Review of Applications of Machine Learning in Cancer Prediction and Diagnosis


Advances in genome sequencing technology have opened research directions that were previously out of reach, and researchers are working hard to combat genetic diseases such as cancer. Artificial intelligence has empowered research in the healthcare sector, and the availability of open-source healthcare datasets has motivated researchers to develop applications that help in the early diagnosis and prognosis of diseases. Further, next-generation sequencing has made it possible to examine the detailed intricacies of biological systems, providing an efficient and cost-effective approach with higher accuracy. The advent of microRNAs, also known as small noncoding RNAs, has begun a paradigm shift in oncological research. We are now able to profile the expression of RNAs using RNA-seq data, and microRNA profiling has helped uncover their roles in various genetic and biological processes. In this paper, we present a review of the machine learning perspective in cancer research. The best way to develop effective cancer treatments and drugs is to better understand the intricacies and complexities of the cancer microenvironment. Although a plethora of methods and techniques have been proposed in the literature, the lethality of cancer has not yet been reduced. In this situation, artificial intelligence (AI), or machine learning, provides a reliable, fast, and efficient way to deal with such challenging diseases.




  1. 1.

    Błaszczyński J, Stefanowski J (2015) Neighbourhood sampling in bagging for imbalanced data. Neurocomputing 150:529–542

    Article  Google Scholar 

  2. 2.

    Ying Lu, Han J (2003) Cancer classification using gene expression data. Inf Syst 28(4):243–268

    MathSciNet  MATH  Article  Google Scholar 

  3. 3.

    Oleg O (2013) Survey of novel feature selection methods for cancer classification. Biological knowledge discovery handbook preprocessing mining, and postprocessing of biological data, pp 379–398

  4. 4.

    Golub Todd R, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H (1999) Molecular classification of cancer class discovery and class prediction by gene expression monitoring. Science 286(5439):531–537

    Article  Google Scholar 

  5. 5.

    Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D, Levine AJ (1999) Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci 96(12):6745–6750

    Article  Google Scholar 

  6. 6.

    Van’t Veer LJ, Dai H, Van De Vijver MJ, He YD, Hart AA, Mao M, Friend SH (2002) Gene expression profiling predicts clinical outcome of breast cancer. Nature 415(6871):530–536

    Article  Google Scholar 

  7. 7.

    Guyon I, Weston J, Barnhill S, Vapnik V (2002) Gene selection for cancer classification using support vector machines. Mach Learn 46(1–3):389–422

    MATH  Article  Google Scholar 

  8. 8.

    Shevade SK, Sathiya Keerthi S (2003) A simple and efficient algorithm for gene selection using sparse logistic regression. Bioinf 19(17):2246–2253

    Article  Google Scholar 

  9. 9.

    Furlanello C, Serafini M, Merler S, Jurman G (2003) Gene selection and classification by entropy-based recursive feature elimination. In: Proceedings of the international joint conference on neural networks, 4:3077–3082. IEEE

  10. 10.

    Chu W, Ghahramani Z, Falciani F, Wild DL (2005) Biomarker discovery in microarray gene expression data with Gaussian processes. Bioinf 21(16):3385–3393

    Article  Google Scholar 

  11. 11.

    Saeys Y, Inza I, Larrañaga P (2007) A review of feature selection techniques in bioinformatics. Bioinf 23(19):2507–2517

    Article  Google Scholar 

  12. 12.

    Inza I, Larrañaga P, Blanco R, Cerrolaza AJ (2004) Filter versus wrapper gene selection approaches in DNA microarray domains. Artif Intell Med 31(2):91–103

    Article  Google Scholar 

  13. 13.

    Shen Qi, Shi W-M, Kong W (2008) Hybrid particle swarm optimization and tabu search approach for selecting genes for tumor classification using gene expression data. Comput Biol Chem 32(1):53–60

    MATH  Article  Google Scholar 

  14. 14.

    Li S, Xixian Wu, Tan M (2008) Gene selection using hybrid particle swarm optimization and genetic algorithm. Soft Comput 12(11):1039–1048

    Article  Google Scholar 

  15. 15.

    Branke J, Deb K, Dierolf H, Osswald M (2004) Finding knees in multi-objective optimization. International conference on parallel problem solving from nature. Springer, Berlin, Heidelberg, pp 722–731

    Google Scholar 

  16. 16.

    Marler RT, Arora JS (2004) Survey of multi-objective optimization methods for engineering. Struct Multidiscip Optim 26(6):369–395

    MathSciNet  MATH  Article  Google Scholar 

  17. 17.

    BoussaïD I, Lepagnot J, Siarry P (2013) A survey on optimization metaheuristics. Inf Sci 237:82–117

    MathSciNet  MATH  Article  Google Scholar 

  18. 18.

    Chakraborty A, Kar AK (2017) Swarm intelligence: a review of algorithms. In: Nature-inspired computing and optimization. Springer, pp 475–494

  19. 19.

    Weinberg RA (1991) Tumor suppressor genes. Science 254(5035):1138–1146

    Article  Google Scholar 

  20. 20.

    Knoechel B, Roderick JE, Williamson KE, Zhu J, Lohr JG, Cotton MJ, Gillespie SM (2014) An epigenetic mechanism of resistance to targeted therapy in T cell acute lymphoblastic leukemi. Nat Genet 46(4):364–370

    Article  Google Scholar 

  21. 21.

    Rini BI, Atkins MB (2009) Resistance to targeted therapy in renal-cell carcinoma. Lancet Oncol 10(10):992–1000

    Article  Google Scholar 

  22. 22.

    Housman G, Byler S, Heerboth S, Lapinska K, Longacre M, Snyder N, Sarkar S (2014) Drug resistance in cancer an overview. Cancers 6(3):1769–1792

    Article  Google Scholar 

  23. 23.

    Fitzgerald JB, Schoeberl B, Nielsen UB, Sorger PK (2006) Systems biology and combination therapy in the quest for clinical efficacy. Nat Chem Biol 2(9):458–466

    Article  Google Scholar 

  24. 24.

    Cokol M, Chua HN, Tasan M, Mutlu B, Weinstein ZB, Suzuki Yo, Nergiz ME (2011) Systematic exploration of synergistic drug pairs. Mol Syst Biol 7(1):544–553

    Article  Google Scholar 

  25. 25.

    Tallarida RJ (2011) Quantitative methods for assessing drug synergism. Genes Cancer 2(11):1003–1008

    Article  Google Scholar 

  26. 26.

    Ashton JC (2015) Drug combination studies and their synergy quantification using the Chou-Talalay method. Cancer Res 75(11):2400–2400

    Article  Google Scholar 

  27. 27.

    Foucquier J, Guedj M (2015) Analysis of drug combinations current methodological landscape. Pharmacol Res Perspect 3(3):00149

    Article  Google Scholar 

  28. 28.

    Kotelnikova E, Yuryev A, Mazo I, Daraselia N (2010) Computational approaches for drug repositioning and combination therapy design. J Bioinf Comput Biol 8(3):593–606

    Article  Google Scholar 

  29. 29.

    Xiao G, Ma S, Minna J, Xie Y (2014) Adaptive prediction model in prospective molecular signature-based clinical studies . Clin Cancer Res 20(3):531–539

    Article  Google Scholar 

  30. 30.

    Garnett MJ, Edelman EJ, Heidorn SJ, Greenman CD, Dastur A, Lau KW, Greninger P (2012) Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature 483(7391):570–575

    Article  Google Scholar 

  31. 31.

    Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, Wilson CJ (2012) The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483(7391):603–607

    Article  Google Scholar 

  32. 32.

    Yamada M, Lian W, Goyal A, Chen J, Wimalawarne K, Khan SA, Chang Y (2017) Convex factorization machine for toxicogenomics prediction. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1215–1224

  33. 33.

    Jamali M, Ester M (2010) A matrix factorization technique with trust propagation for recommendation in social networks. In: Proceedings of the fourth ACM conference on Recommender systems, pp 135–142

  34. 34.

    Wang L, Li X, Zhang L, Gao Q (2017) Improved anticancer drug response prediction in cell lines using matrix factorization with similarity regularization. BMC Cancer 17(1):513–524

    Article  Google Scholar 

  35. 35.

    Evans WE, McLeod HL (2003) Pharmacogenomics drug disposition, drug targets, and side effects. N Engl J Med 348(6):538–549

    Article  Google Scholar 

  36. 36.

    Wei D-Q, Wang J-F, Chen C, Li Y, Chou K-C (2008) Molecular modeling of two CYP2C19 SNPs and its implications for personalized drug design. Protein Pept Lett 15(1):27–32

    Article  Google Scholar 

  37. 37.

    Mizutani S, Pauwels E, Stoven V, Goto S, Yamanishi Y (2012) Relating drug–protein interaction network with drug side effects. Bioinformatics 28(18):i522–i528

    Article  Google Scholar 

  38. 38.

    Rarey M, Kramer B, Lengauer T, Klebe G (1996) A fast flexible docking method using an incremental construction algorithm. J Mol Biol 261(3):470–489

    Article  Google Scholar 

  39. 39.

    Xie Li, Evangelidis T, Xie L, Bourne PE (2011) Drug discovery using chemical systems biology weak inhibition of multiple kinases may contribute to the anti-cancer effect of nelfinavir. PLoS Comput Biol 7(4):e1002037

    Article  Google Scholar 

  40. 40.

    Jacob L, Vert J-P (2008) Protein-ligand interaction prediction an improved chemogenomics approach. Bioinformatics 24(19):2149–2156

    Article  Google Scholar 

  41. 41.

    Zhu S, Okuno Y, Tsujimoto G, Mamitsuka H (2005) A probabilistic model for mining implicit chemical compound–gene relations from literature. Bioinformatics 21(2):ii245–ii251

    Google Scholar 

  42. 42.

    Yamanishi Y, Araki M, Gutteridge A, Honda W, Kanehisa M (2008) Prediction of drug–target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 24(13):i232–i240

    Article  Google Scholar 

  43. 43.

    Wang Y-C, Zhang C-H, Deng N-Y, Wang Y (2011) Kernel-based data fusion improves the drug–protein interaction prediction. Comput Biol Chem 35(6):353–362

    MathSciNet  Article  Google Scholar 

  44. 44.

    Fakhraei S, Huang B, Raschid L, Getoor L (2014) Network-based drug-target interaction prediction with probabilistic soft logic. IEEE/ACM Trans Comput Biol Bioinf (TCBB) 11(5):775–787

    Article  Google Scholar 

  45. 45.

    van Laarhoven T, Nabuurs SB, Marchiori E (2011) Gaussian interaction profile kernels for predicting drug–target interaction. Bioinformatics 27(21):3036–3043

    Article  Google Scholar 

  46. 46.

    Zheng X, Ding H, Mamitsuka H, Zhu S (2013) Collaborative matrix factorization with multiple similarities for predicting drug-target interactions. In: Proceedings of the 19th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1025–1033

  47. 47.

    Bolton EE, Wang Y, Thiessen PA, Bryant SH (2008) PubChem integrated platform of small molecules and biological activities. Ann Rep Comput Chem 4:217–241

    Article  Google Scholar 

  48. 48.

    Knox C, Law V, Jewison T, Liu P, Ly S, Frolkis A, Pon A (2010) DrugBank 30 a comprehensive resource for ‘omics’ research on drugs. Nucleic Acids Res 39(1):D1035–D1041

    Google Scholar 

  49. 49.

    Gaulton A, Bellis LJ, Patricia Bento A, Chambers J, Davies M, Hersey A, Light Y (2011) CHEMBL a large-scale bioactivity database for drug discovery. Nucleic Acids Res 40(D1):D1100–D1107

    Article  Google Scholar 

  50. 50.

    Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M (2011) KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res 40(D1):D109–D114

    Article  Google Scholar 

  51. 51.

    Hira ZM, Gillies DF (2015) A review of feature selection and feature extraction methods applied on microarray data. Adv Bioinf 2015(198363):1–13

    Article  Google Scholar 

  52. 52.

    Bolón-Canedo V, Sánchez-Maroño N, Alonso-Betanzos A (2016) Feature selection for high-dimensional data. Prog Artif Intell 5(2):65–75

    Article  Google Scholar 

  53. 53.

    Haixiang G, Yijing Li, Jennifer Shang Gu, Mingyun HY, Bing G (2017) Learning from class-imbalanced data: review of methods and applications. Expert Syst Appl 73:220–239

    Article  Google Scholar 

  54. 54.

    Krawczyk B, Galar M, Jeleń Ł (2016) Francisco Herrera Evolutionary undersampling boosting for imbalanced classification of breast cancer malignancy. Appl Soft Comput 38:714–726

    Article  Google Scholar 

  55. 55.

    Dagogo-Jack I, Shaw AT (2018) Tumour heterogeneity and resistance to cancer therapies. Nat Rev Clin Oncol 15(2):81–94

    Article  Google Scholar 

  56. 56.

    Tadist K, Najah S, Nikolov NS, Mrabti F, Zahi A (2019) Feature selection methods and genomic big data: a systematic review. J Big Data 6(79):1–24

    Google Scholar 

  57. 57.

    Bartel DP (2009) Micrornas: target recognition and regulatory functions. Cell 136:215–233

    Article  Google Scholar 

  58. 58.

    Lee Y, Kim M, Han J, Yeom K-H, Lee S, Baek SH, Narry Kim V (2004) Microrna genes are transcribed by rna polymerase ii. EMBO J 23(20):4051–4060

    Article  Google Scholar 

  59. 59.

    Xie Z, Allen E, Fahlgren N, Calamar A, Givan SA, Carrington JC (2005) Expression of arabidopsis mirna genes. Plant Physiol 138(4):2145–2154

    Article  Google Scholar 

  60. 60.

    Richard Lu, Barca O (2012) Fine-tuning oligodendrocyte development by micrornas. Front Neurosci 6:13

    Google Scholar 

  61. 61.

    Hayes DF, Bast RC, Desch CE, Fritsche H, Kemeny NE, Jessup JM, Locker GY, Macdonald JS, Mennel RG, Norton L et al (1996) Tumor marker utility grading system: a framework to evaluate clinical utility of tumor markers. J Natl Cancer Inst 88(20):1456–1466

    Article  Google Scholar 

  62. 62.

    Garzon R, Marcucci G, Croce CM (2010) Targeting micrornas in cancer: rationale, strategies and challenges. Nat Rev Drug Discovery 9(10):775–789

    Article  Google Scholar 

  63. 63.

    Ambros V (2003) Microrna pathways in flies and worms: growth, death, fat, stress, and timing. Cell 113(6):673–676

    Article  Google Scholar 

  64. 64.

    Doench JG, Sharp PA (2004) Specificity of microrna target selection in translational repression. Genes Dev 18(5):504–511

    Article  Google Scholar 

  65. 65.

    Zhang H, Kolb FA, Brondani V, Billy E, Filipowicz W (2002) Human dicer preferentially cleaves dsrnas at their termini without a requirement for atp. EMBO J 21(21):5875–5885

    Article  Google Scholar 

  66. 66.

    Kosaka N, Iguchi H, Ochiya T (2010) Circulating microrna in body uid: a new potential biomarker for cancer diagnosis and prognosis. Cancer Sci 101(10):2087–2092

    Article  Google Scholar 

  67. 67.

    Ploussard G, de la Taille A (2010) Urine biomarkers in prostate cancer. Nat Rev Urol 7(2):101–109

    Article  Google Scholar 

  68. 68.

    Li A, Omura N, Hong S-M, Vincent A, Walter K, Grith M, Borges M, Goggins M (2010) Pancreatic cancers epigenetically silence sip1 and hypomethylate and overexpress mir-200a/200b in association with elevated circulating mir-200a and mir-200b levels. Cancer Res 70(13):5226–5237

    Article  Google Scholar 

  69. 69.

    Ho AS, Huang X, Cao H, Christman-Skieller C, Bennewith K, Le Q-T, Koong AC (2010) Circulating mir-210 as a novel hypoxia marker in pancreatic cancer. Transl Oncol 3(2):109–113

    Article  Google Scholar 

  70. 70.

    Wang J, Chen J, Chang P, LeBlanc A, Li D, Abbruzzesse JL, Frazier ML, Killary AM, Sen S (2009) Micrornas in plasma of pancreatic ductal adenocarcinoma patients as novel blood-based biomarkers of disease. Cancer Prev Res 2(9):807–813

    Article  Google Scholar 

  71. 71.

    Morimura R, Komatsu S, Ichikawa D, Takeshita H, Tsujiura M, Nagata H, Konishi H, Shiozaki A, Ikoma H, Okamoto K et al (2011) Novel diagnostic value of circulating mir-18a in plasma of patients with pancreatic cancer. Br J Cancer 105(11):1733–1740

    Article  Google Scholar 

  72. 72.

    Mitchell PS, Parkin RK, Kroh EM, Fritz BR, Wyman SK, Pogosova- EL, Agadjanyan AP, Noteboom J, O’Briant KC, Allen A et al (2008) Circulating micrornas as stable blood-based markers for cancer detection. Proc Natl Acad Sci 105(30):10513–10518

    Article  Google Scholar 

  73. 73.

    Zhu W, Qin W, Atasoy U, Sauter ER (2009) Circulating micrornas in breast cancer and healthy subjects. BMC Res Notes 2(1):89

    Article  Google Scholar 

  74. 74.

    Heneghan HM, Miller N, Kelly R, Newell J, Kerin MJ (2010) Systemic mirna-195 differentiates breast cancer from other malignancies and is a potential biomarker for detecting noninvasive and early stage disease. Oncologist 15(7):673–682

    Article  Google Scholar 

  75. 75.

    Asaga S, Kuo C, Nguyen T, Terpenning M, Giuliano AE, Hoon DSB (2011) Direct serum assay for microrna-21 concentrations in early and advanced breast cancer. Clin Chem 57(1):84–91

    Article  Google Scholar 

  76. 76.

    Zhao H, Shen J, Medico L, Wang D, Ambrosone CB, Liu S (2010) A pilot study of circulating mirnas as potential biomarkers of early stage breast cancer. PLoS ONE 5(10):e13735

    Article  Google Scholar 

  77. 77.

    Chen Xi, Ba Yi, Ma L, Cai X, Yin Y, Wang K, Guo J, Zhang Y, Chen J, Guo X et al (2008) Characterization of micrornas in serum: a novel class of biomarkers for diagnosis of cancer and other diseases. Cell Res 18(10):997–1006

    Article  Google Scholar 

  78. 78.

    Shen J, Liu Z, Todd NW, Zhang H, Liao J, Lei Yu, Guarnera MA, Li R, Cai L, Zhan M et al (2011) Diagnosis of lung cancer in individuals with solitary pulmonary nodules by plasma microrna biomarkers. BMC Cancer 11(1):1

    Article  Google Scholar 

  79. 79.

    Zheng D, Haddadin S, Wang Y, Li-Qun Gu, Perry MC, Freter CE, Wang MX (2011) Plasma micrornas as novel biomarkers for early detection of lung cancer. Int J Clin Exp Pathol 4(6):575–586

    Google Scholar 

  80. 80.

    Taylor DD, Gercel-Taylor C (2008) Microrna signatures of tumor-derived exosomes as diagnostic biomarkers of ovarian cancer. Gynecol Oncol 110(1):13–21

    Article  Google Scholar 

  81. 81.

    Resnick KE, Alder H, Hagan JP, Richardson DL, Croce CM, Cohn DE (2009) The detection of differentially expressed micrornas from the serum of ovarian cancer patients using a novel real-time pcr platform. Gynecol Oncol 112(1):55–59

    Article  Google Scholar 

  82. 82.

    Tsujiura M, Ichikawa D, Komatsu S, Shiozaki A, Takeshita H, Kosuga T, Konishi H, Morimura R, Deguchi K, Fujiwara H et al (2010) Circulating micrornas in plasma of patients with gastric cancers. Br J Cancer 102(7):1174–1179

    Article  Google Scholar 

  83. 83.

    Li X, Luo F, Li Q, Meihua Xu, Feng D, Zhang G, Wei Wu (2011) Identification of new aberrantly expressed mirnas in intestinal-type gastric cancer and its clinical significance. Oncol Rep 26(6):1431–1439

    Google Scholar 

  84. 84.

    Yamamoto Y, Kosaka N, Tanaka M, Koizumi F, Kanai Y, Mizutani T, Murakami Y, Kuroda M, Miyajima A, Kato T et al (2009) Microrna-500 as a potential diagnostic marker for hepatocellular carcinoma. Biomarkers 14(7):529–538

    Article  Google Scholar 

  85. 85.

    Qu KZ, Zhang Ke, Li HaiRong, Afdhal NH, Albitar M (2011) Circulating micrornas as biomarkers for hepatocellular carcinoma. J Clin Gastroenterol 45(4):355–360

    Article  Google Scholar 

  86. 86.

    Zhang C, Wang C, Chen Xi, Yang C, Li Ke, Wang J, Dai J, Zhibin Hu, Zhou X, Chen L et al (2010) Expression profile of micrornas in serum: a fingerprint for esophageal squamous cell carcinoma. Clin Chem 56(12):1871–1879

    Article  Google Scholar 

  87. 87.

    Wong T-S, Liu X-B, Wong B-H, Ng R-M, Yuen A-W, Wei WI (2008) Mature mir-184 as potential oncogenic microrna of squamous cell carcinoma of tongue. Clin Cancer Res 14(9):2588–2592

    Article  Google Scholar 

  88. 88.

    Sung JJ, Chong WS, Jin H, Lam EK, Shin VY, Yu J, Poon TC, Ng SS, Ng EK (2009) 1070 Differential Expression of MicroRNAs in Plasma of Colorectal Cancer Patients: A Potential Marker for Colorectal Cancer Screening. Gastroenterol 136(5):A-165

    Google Scholar 

  89. 89.

    Huang Z, Huang D, Ni S, Peng Z, Sheng W, Xiang Du (2010) Plasma micrornas are promising novel biomarkers for early detection of colorectal cancer. Int J Cancer 127(1):118–126

    Article  Google Scholar 

  90. 90.

    Schreiber R, Mezencev R, Matyunina LV, McDonald JF (2016) Evidence for the role of microRNA 374b in acquired cisplatin resistance in pancreatic cancer cells. Cancer Gene Ther 23(8):241–245

    Article  Google Scholar 

  91. 91.

    Velagapudi SP, Cameron MD, Haga CL, Rosenberg LH, Lafitte M, Duckett DR, Phinney DG, Disney MD (2016) Design of a small molecule against an oncogenic noncoding RNA. Proc Natl Acad Sci 113(21):5898–5903

    Article  Google Scholar 

  92. 92.

    Hamam R, Ali AM, Alsaleh KA, Kassem M, Alfayez M, Aldahmash A, Alajez NM (2016) microRNA expression profiling on individual breast cancer patients identifies novel panel of circulating microRNA for early detection. Sci Rep 6(1):1–8

    Article  Google Scholar 

  93. 93.

    Rupaimoole R, Calin GA, Lopez-Berestein G, Sood AK (2016) mirna deregulation in cancer cells and the tumor microenvironment. Cancer Discov 6(3):235–246

    Article  Google Scholar 

  94. 94.

    Cantini L, Isella C, Petti C, Picco G, Chiola S, Ficarra E, Caselle M, Medico E (2015) MicroRNA–mRNA interactions underlying colorectal cancer molecular subtypes. Nat Commun 6(1):1–2

    Article  Google Scholar 

  95. 95.

    Mortazavi A, Williams BA, McCue K, Schaefier L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by rna-seq. Nat Methods 5(7):621–628

    Article  Google Scholar 

  96. 96.

    Murakami Y, Tanahashi T, Okada R, Toyoda H, Kumada T, Enomoto M, Tamori A, Kawada N, Taguchi YH, Azuma T (2014) Comparison of hepatocellular carcinoma miRNA expression profiling as evaluated by next generation sequencing and microarray. PLoS ONE 9(9):e106314

    Article  Google Scholar 

  97. 97.

    Nam J-W, Shin K-R, Han J, Yoontae Lee V, Kim N, Zhang B-T (2005) Human microrna prediction through a probabilistic co-learning model of sequence and structure. Nucleic Acids Res 33(11):3570–3581

    Article  Google Scholar 

  98. 98.

    Huang T-H, Fan B, Rothschild MF, Zhi-Liang Hu, Li K, Zhao S-H (2007) Mirfinder: an improved approach and software implementation for genome-wide fast microrna precursor scans. BMC Bioinf 8(1):1

    Article  Google Scholar 

  99. 99.

    Ng KLS, Mishra SK (2007) De novo svm classification of precursor micrornas from genomic pseudo hairpins using global and intrinsic folding measures. Bioinformatics 23(11):1321–1330

    Article  Google Scholar 

  100. 100.

    Ding J, Zhou S, Guan J (2010) Mirensvm: towards better prediction of microrna precursors using an ensemble svm classifier with multi-loop features. BMC Bioinf 11(11):1

    Google Scholar 

  101. 101.

    Xue C, Li F, He T, Liu G-P, Li Y, Zhang X (2005) Classification of real and pseudo microrna precursors using local structure-sequence features and support vector machine. BMC Bioinf 6(1):310

    Article  Google Scholar 

  102. 102.

    Ana Kozomara and Sam Griffiths-Jones (2014) mirbase: annotating high confidence micrornas using deep sequencing data. Nucleic Acids Res 42(D1):D68–D73

    Article  Google Scholar 

  103. 103.

    Seunghyun Park, Seonwoo Min, Hyunsoo Choi, and Sungroh Yoon (2016) deepmirgene: Deep neural network based precursor microrna prediction. arXiv preprint arXiv:1605.00017

  104. 104.

    Cheng S, Guo M, Wang C, Liu X, Liu Y, Xuejian Wu (2015) MiRTDL: a deep learning approach for miRNA target prediction. IEEE ACM Trans Comput Biol Bioinf 13(6):1161–1169

    Article  Google Scholar 

  105. 105.

    Nadeem MW, Ghamdi MA, Hussain M, Khan MA, Khan KM, Almotiri SH, Butt SA (2020) Brain tumor analysis empowered with deep learning: A review, taxonomy, and future challenges. Brain Sci 10(2):118

    Article  Google Scholar 

  106. 106.

    Thakur SK, Singh DP, Choudhary J (2020) Lung cancer identification: a review on detection and classification. Cancer Metastasis Rev

  107. 107.

    Sharif MI, Li JP, Naz J, Rashid I (2020) A comprehensive review on multi-organs tumor detection based on machine learning. Pattern Recognit Lett 131:30–37

    Article  Google Scholar 

  108. 108.

    Yassin NI, Omran S, El Houby EM, Allam H (2018) Machine learning techniques for breast cancer computer aided diagnosis using different image modalities: A systematic review. Comput Methods Progr Biomed 156:25–45

    Article  Google Scholar 

  109. 109.

    Chato L, Latifi S. (2017) Machine learning and deep learning techniques to predict overall survival of brain tumor patients using MRI images. In: 2017 IEEE 17th international conference on bioinformatics and bioengineering (BIBE), pp 9–14

  110. 110.

    Montazeri M, Montazeri M, Montazeri M, Beigzadeh A (2016) Machine learning models in breast cancer survival prediction. Technol Health Care 24(1):31–42

    Article  Google Scholar 

  111. 111.

    Kourou K, Exarchos TP, Exarchos KP, Karamouzis MV, Fotiadis DI (2015) Machine learning applications in cancer prognosis and prediction. Comput Struct Biotechnol J 13:8–17

    Article  Google Scholar 

  112. 112.

    Shen L, Tan EC (2005) Dimension reduction-based penalized logistic regression for cancer classification using microarray data. IEEE/ACM Trans Comput Biol Bioinf 2(2):166–175

    Article  Google Scholar 

  113. 113.

    Wang Y, Tetko IV, Hall MA, Frank E, Facius A, Mayer KF, Mewes HW (2005) Gene selection from microarray data for cancer classification—a machine learning approach. Comput Biol Chem 29(1):37–46

    MATH  Article  Google Scholar 

  114. 114.

    Chu F, Xie W, Wang L (2004) Gene selection and cancer classification using a fuzzy neural network. IEEE Ann Meet Fuzzy Inf Process NAFIPS 2:555–559

    Google Scholar 

  115. 115.

    Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Boldrick JC, Sabet H, Tran T, Yu X, Powell JI (2000) Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403(6769):503–511

    Article  Google Scholar 

  116. 116.

    Khan J, Wei JS, Ringner M, Saal LH, Ladanyi M, Westermann F, Berthold F, Schwab M, Antonescu CR, Peterson C, Meltzer PS (2001) Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nature Med 7(6):673–679

    Article  Google Scholar 

  117. 117.

    Chen X, Cheung ST, So S, Fan ST, Barry C, Higgins J, Lai KM, Ji J, Dudoit S, Ng IO, Van De Rijn M (2002) Gene expression patterns in human liver cancers. Mol Biol Cell 13(6):1929–1939

    Article  Google Scholar 

  118. 118.

    Wang L, Chu F, Xie W (2007) Accurate cancer classification using expressions of very few genes. IEEE/ACM Trans Comput Biol Bioinf 4(1):40–53

    Article  Google Scholar 

  119. 119.

    Ramaswamy S, Tamayo P, Rifkin R, Mukherjee S, Yeang CH, Angelo M, Ladd C, Reich M, Latulippe E, Mesirov JP, Poggio T (2001) Multiclass cancer diagnosis using tumor gene expression signatures. Proc Natl Acad Sci 98(26):15149–15154

    Article  Google Scholar 

  120. 120.

    Cho SB, Won HH (2007) Cancer classification using ensemble of neural networks with multiple significant gene subsets. Appl Intell 26(3):243–250

    MATH  Article  Google Scholar 

  121. 121.

    Tan TZ, Quek C, Ng GS, Razvi K (2008) Ovarian cancer diagnosis with complementary learning fuzzy neural network. Artif Intell Med 43(3):207–222

    Article  Google Scholar 

  122. 122.

    Schummer M, Ng W, Bumgarner R, Nelson P, Schummer B, Bednarski D et al (1999) Comparative hybridization of an array of 21,500 ovarian cDNAs for the discovery genes overexpressed in ovarian carcinomas. Gene 238:375–385

    Article  Google Scholar 

  123. 123.

    Petricoin EF III, Ardekani AM, Hitt BA, Levine PJ, Fusaro VA, Steinberg SM et al (2002) Use of proteomic patterns in serum to identify ovarian cancer. Lancet 359(9306):572–577

    Article  Google Scholar 

  124. 124.

    Glaab E, Bacardit J, Garibaldi JM, Krasnogor N (2012) Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data. PLoS ONE 7(7):e39932

    Article  Google Scholar 

  125. 125.

    Singh D, Febbo P, Ross K, Jackson D, Manola J et al (2002) Gene expression correlates of clinical prostate cancer behavior. Cancer Cell 1:203–209

    Article  Google Scholar 

  126. 126.

    Shipp M, Ross K, Tamayo P, Weng A, Kutok J et al (2002) Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nat Med 8:68–74

    Article  Google Scholar 

  127. 127.

    Chin S, Teschendorff A, Marioni J, Wang Y, Barbosa-Morais N et al (2007) High-resolution aCGH and expression profiling identifies a novel genomic subtype of ER negative breast cancer. Genome Biol 8:R215

    Article  Google Scholar 

  128. 128.

    Liu Q, Sung AH, Chen Z, Liu J, Chen L, Qiao M, Wang Z, Huang X, Deng Y (2011) Gene selection and classification for cancer microarray data based on machine learning and similarity measures. BMC Genom 12(S5):S1

    Article  Google Scholar 

  129. 129.

    Chen KH, Wang KJ, Wang KM, Angelia MA (2014) Applying particle swarm optimization-based decision tree classifier for cancer classification on gene expression data. Appl Soft Comput 24:773–780

    Article  Google Scholar 

  130. 130.

    Taiwan Cancer Registry, (2012), http://tcr.cph.ntu.edu.tw

  131. 131.

    Margoosian A, Abouei J (2013) Ensemble-based classifiers for cancer classification using human tumor microarray data. In: 2013 21st Iranian conference on electrical engineering (ICEE), IEEE, pp 1–6

  132. 132.

    Ramaswamy S et al (2002) Multiclass cancer diagnosis using tumor gene expression signatures. Proc Natl Acad Sci PNAS 98(26):15149–15154

    Article  Google Scholar 

  133. 133.

    Abdel-Zaher AM, Eldeib AM (2016) Breast cancer classification using deep belief networks. Expert Syst Appl 46:139–144

    Article  Google Scholar 

  134. 134.

    Dwivedi AK (2018) Artificial neural network model for effective cancer classification using microarray gene expression data. Neural Comput Appl 29(12):1545–1554

    Article  Google Scholar 

  135. 135.

    Sevakula RK, Singh V, Verma NK, Kumar C, Cui Y (2018) Transfer learning for molecular cancer classification using deep neural networks. IEEE ACM Trans Comput Biol Bioinf 16(6):2089–2100

    Article  Google Scholar 

  136. 136.

    Stiglic G, Kokol P (2010) Stability of ranked gene lists in large microarray analysis studies. J Biomed Biotechnol 2010:1–9

    Article  Google Scholar 

  137. 137.

    Ting FF, Tan YJ, Sim KS (2019) Convolutional neural network improvement for breast cancer classification. Expert Syst Appl 120:103–115

    Article  Google Scholar 

  138. 138.

    Mammographic Image Analysis Society (MIAS). (2018). http://www.mammoimage.org/databases/ Accessed: 25 January 2018

  139. 139.

    Ghoneim A, Muhammad G, Hossain MS (2020) Cervical cancer classification using convolutional neural networks and extreme learning machines. Future Gener Comput Syst 102:643–649

    Article  Google Scholar 

  140. 140.

    Yu L, Chen H, Dou Q, Qin J, Heng PA (2016) Automated melanoma recognition in dermoscopy images via very deep residual networks. IEEE Trans Med Imaging 36(4):994–1004

    Article  Google Scholar 

  141. 141.

    Gutman D, Codella NCF, Celebi E, Helba B, Marchetti M, Mishra N, Halpern A (2016) Skin lesion analysis toward melanoma detection: a challenge at the international symposium on biomedical imaging (ISBI) hosted by the International Skin Imaging Collaboration (ISIC), arXiv preprint arXiv:1605.01397

  142. 142.

    Albarqouni S, Baur C, Achilles F, Belagiannis V, Demirci S, Navab N (2016) Aggnet: deep learning from crowds for mitosis detection in breast cancer histology images. IEEE Trans Med Imaging 35(5):1313–1321

    Article  Google Scholar 

  143. 143.

    Von Ahn L (2006) Games with a purpose. Comput 39(6):92–94

    Article  Google Scholar 

  144. 144.

    Wang P, Wang L, Li Y, Song Q, Lv S, Hu X (2019) Automatic cell nuclei segmentation and classification of cervical Pap smear images. Biomed Signal Process Control 48:93–103

    Article  Google Scholar 

  145. 145.

    Zhang L, Lu L, Nogues I, Summers RM, Liu S, Yao J (2017) DeepPap: deep convolutional networks for cervical cell classification. IEEE J Biomed Health Inf 21(6):1633–1643

    Article  Google Scholar 

  146. 146.

    Kim Y, Zheng S, Tang J, Zheng WJ, Li Z, Jiang X. (2020) Anti-cancer Drug Synergy Prediction in Understudied Tissues using Transfer Learning. bioRxiv

  147. 147.

    Jiang P, Huang S, Fu Z, Sun Z, Lakowski TM, Hu P (2020) Deep graph embedding for prioritizing synergistic anticancer drug combinations. Comput Struct Biotechnol J 18:427–438

    Article  Google Scholar 

  148. 148.

    Ekşioğlu I, Tan M (2020) Prediction of Drug Synergy by Ensemble Learning. arXiv preprint arXiv:2001.01997

  149. 149.

    O’Neil J, Benita Y, Feldman I, Chenard M, Roberts B, Liu Y, Li J, Kral A, Lejnine S, Loboda A, Arthur W (2016) An unbiased oncology compound screen to identify novel combination strategies. Mol Cancer Ther 15(6):1155–1162

    Article  Google Scholar 

  150. 150.

    Zhang H, Feng J, Zeng A, Payne PR, Li F (2020) Predicting Tumor Cell Response to Synergistic Drug Combinations Using a Novel Simplified Deep Learning Model. bioRxiv

  151. 151.

    Kuru HI, Tastan O, Cicek AE (2020) MatchMaker: a deep learning framework for drug synergy prediction. bioRxiv

  152. 152.

    Preuer K, Lewis RP, Hochreiter S, Bender A, Bulusu KC, Klambauer G (2018) DeepSynergy: predicting anti-cancer drug synergy with Deep Learning. Bioinformatics 34(9):1538–1546

    Article  Google Scholar 

  153. 153.

    Wildenhain J, Spitzer M, Dolma S, Jarvik N, White R, Roy M, Griffiths E, Bellows DS, Wright GD, Tyers M (2015) Prediction of synergism from chemical-genetic interactions by machine learning. Cell Syst 1(6):383–395

    Article  Google Scholar 

  154. 154.

    Janizek JD, Celik S, Lee SI (2018) Explainable machine learning prediction of synergistic drug combinations for precision cancer medicine. bioRxiv, 1:331769

  155. 155.

    Mason DJ, Eastman RT, Lewis RP, Stott IP, Guha R, Bender A (2018) Using machine learning to predict synergistic antimalarial compound combinations with novel structures. Front Pharmacol 9:1096

    Article  Google Scholar 

  156. 156.

    Chen G, Tsoi A, Xu H, Zheng WJ (2018) Predict effective drug combination by deep belief network and ontology fingerprints. J Biomed Inf 85:149–154

    Article  Google Scholar 

  157. 157.

    Sharma A, Rani R (2018) An integrated framework for identification of effective and synergistic anti-cancer drug combinations. J Bioinf Comput Biol 16(05):1850017

    Article  Google Scholar 

  158. 158.

    Held MA, Langdon CG, Platt JT, Graham-Steed T, Liu Z, Chakraborty A, Bacchiocchi A, Koo A, Haskins JW, Bosenberg MW, Stern DF (2013) Genotype-selective combination therapies for melanoma identified by high throughput drug screening. Cancer Discov 3(1):52–67

    Article  Google Scholar 

  159. 159.

    Jang IS, Neto EC, Guinney J, Friend SH, Margolin AA (2014) Systematic assessment of analytical methods for drug sensitivity prediction from cancer cell line data. Biocomput 2014:63–74

    Google Scholar 

  160. 160.

    Menden MP, Iorio F, Garnett M, McDermott U, Benes CH, Ballester PJ, Saez-Rodriguez J (2013) Machine learning prediction of cancer cell sensitivity to drugs based on genomic and chemical properties. PLoS ONE 8(4):e61318

    Article  Google Scholar 

  161. 161.

    Turki T, Wei Z, Wang JT (2017) Transfer learning approaches to improve drug sensitivity prediction in multiple myeloma patients. IEEE Access 5:7381–7393

    Article  Google Scholar 

  162. 162.

    Wan Q, Pal R (2014) An ensemble based top performing approach for NCI-DREAM drug sensitivity prediction challenge. PLoS ONE 9(6):e101183

    Article  Google Scholar 

  163. 163.

    Dong Z, Zhang N, Li C, Wang H, Fang Y, Wang J, Zheng X (2015) Anticancer drug sensitivity prediction in cell lines from baseline gene expression through recursive feature selection. BMC Cancer 15(1):1–2

    Article  Google Scholar 

  164. 164.

    Rahman R, Matlock K, Ghosh S, Pal R (2017) Heterogeneity aware random forest for drug sensitivity prediction. Sci Rep 7(1):1–1

    Article  Google Scholar 

  165. 165.

    Yuan H, Paskov I, Paskov H, González AJ, Leslie CS (2016) Multitask learning improves prediction of cancer drug sensitivity. Sci Rep 6:31619

    Article  Google Scholar 

  166. 166.

    Ali M, Aittokallio T (2019) Machine learning and feature selection for drug response prediction in precision oncology applications. Biophys Rev 11(1):31–39

    Article  Google Scholar 

  167. 167.

    Haider S, Rahman R, Ghosh S, Pal R (2015) A copula based approach for design of multivariate random forests for drug sensitivity prediction. PLoS ONE 10(12):e0144490

    Article  Google Scholar 

  168. 168.

    He X, Folkman L, Borgwardt K (2018) Kernelized rank learning for personalized drug recommendation. Bioinformatics 34(16):2808–2816

    Article  Google Scholar 

  169. 169.

    Matlock K, De Niz C, Rahman R, Ghosh S, Pal R (2018) Investigation of model stacking for drug sensitivity prediction. BMC Bioinf 19(3):21–33

    Google Scholar 

  170. 170.

    Riddick G, Song H, Ahn S, Walling J, Borges-Rivera D, Zhang W, Fine HA (2011) Predicting in vitro drug sensitivity using Random Forests. Bioinformatics 27(2):220–224

    Article  Google Scholar 

  171. 171.

    Sharma A, Rani R (2019) Drug sensitivity prediction framework using ensemble and multi-task learning. Int J Mach Learn Cybern 11:1231–1240

    Article  Google Scholar 

  172. 172.

    Sharma A, Rani R (2019) Ensembled machine learning framework for drug sensitivity prediction. IET Syst Biol 14(1):39–46

    Article  Google Scholar 

  173. 173.

    Ezzat A, Wu M, Li XL, Kwoh CK (2017) Drug-target interaction prediction using ensemble learning and dimensionality reduction. Methods 129:81–88

    Article  Google Scholar 

  174. 174.

    Ezzat A, Wu M, Li XL, Kwoh CK (2016) Drug-target interaction prediction via class imbalance-aware ensemble learning. BMC Bioinf 17(19):267–276

    Google Scholar 

  175. 175.

    Tabei Y, Pauwels E, Stoven V, Takemoto K, Yamanishi Y (2012) Identification of chemogenomic features from drug–target interaction networks using interpretable classifiers. Bioinformatics 28(18):i487–i494

    Article  Google Scholar 

  176. 176.

    Chen R, Liu X, Jin S, Lin J, Liu J (2018) Machine learning for drug-target interaction prediction. Molecules 23(9):2208

    Article  Google Scholar 

  177. 177.

    Wen M, Zhang Z, Niu S, Sha H, Yang R, Yun Y, Lu H (2017) Deep-learning-based drug–target interaction prediction. J Proteome Res 16(4):1401–1409

    Article  Google Scholar 

  178. 178.

    Yuan Q, Gao J, Wu D, Zhang S, Mamitsuka H, Zhu S (2016) DrugE-Rank: improving drug–target interaction prediction of new candidate drugs or targets by ensemble learning to rank. Bioinformatics 32(12):i18-27

    Article  Google Scholar 

  179. 179.

    Law V, Knox C, Djoumbou Y, Jewison T. An Chi Guo, Yifeng Liu, Adam Maciejewski, David Arndt, Michael Wilson, Vanessa Neveu, and others (2014) DrugBank 4.0: shedding new light on drug metabolism. Nucleic Acids Res,42:D1

  180. 180.

    Zhang J, Zhu M, Chen P, Wang B (2017) Drugrpe: Random projection ensemble approach to drug-target interaction prediction. Neurocomp 228:256–262

    Article  Google Scholar 

  181. 181.

    He Z, Zhang J, Shi XH, Hu LL, Kong X, Cai YD, Chou KC (2010) Predicting drug-target interaction networks based on functional groups and biological features. PLoS ONE 5(3):e9603

    Article  Google Scholar 

  182. 182.

    Xie L, He S, Song X, Bo X, Zhang Z (2018) Deep learning-based transcriptome data classification for drug-target interaction prediction. BMC Genom 19(7):667

    Article  Google Scholar 

  183. 183.

    Tian K, Shao M, Wang Y, Guan J, Zhou S (2016) Boosting compound-protein interaction prediction by deep learning. Methods 110:64–72

    Article  Google Scholar 

  184. 184.

    Kuhn M, Szklarczyk D, Pletscher-Frankild S, Blicher TH, Von Mering C, Jensen LJ, Bork P (2014) STITCH 4: integration of protein–chemical interactions with user data. Nucleic Acids Res 42(D1):D401–D407

    Article  Google Scholar 

  185. 185.

    Wang J, Archambault B, Xu Y, Taleyarkhan RP (2010) Numerical simulation and experimental study on Resonant Acoustic Chambers—For novel, high-efficiency nuclear particle detectors. Nucl Eng Des 240(11):3716–3726

    Article  Google Scholar 

  186. 186.

    Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger A, Hetherington K, Holm L, Mistry J, Sonnhammer EL (2014) Pfam: the protein families database. Nucl Eng Des 42(D1):D222–D230

    Google Scholar 

  187. 187.

    Feng Q, Dueva E, Cherkasov A, Ester M (2018) Padme: a deep learning-based framework for drug-target interaction prediction. arXiv preprint arXiv:1807.09741

  188. 188.

    He T, Heidemeyer M, Ban F, Cherkasov A, Ester M (2017) Simboost: A readacross approach for predicting drug–target binding affinities using gradient boosting machines. J Cheminf 9(1):24

    Article  Google Scholar 

  189. 189.

    Davis MI, Hunt JP, Herrgard S, Ciceri P, Wodicka LM, Pallares G, Hocker M, Treiber DK, Zarrinkar PP (2011) Comprehensive analysis of kinase inhibitor selectivity. Nat Biotechnol 29(11):1046–1051

    Article  Google Scholar 

  190. 190.

    Metz JT, Johnson EF, Soni NB, Merta PJ, Kifle L, Hajduk PJ (2011) Navigating the kinome. Nat Chem Biol 7(4):200

    Article  Google Scholar 

  191. 191.

    Tang J, Szwajda A, Shakyawar S, Xu T, Hintsanen P, Wennerberg K, Aittokallio T (2014) Making sense of large-scale kinase inhibitor bioactivity data sets: A comparative and integrative analysis. J Chem Inf Model 54(3):735–743

    Article  Google Scholar 

  192. 192.

    Xie L, Zhang Z, He S, Bo X, Song X (2017) Drug—target interaction prediction with a deep-learning-based model. In: 2017 IEEE international conference on bioinformatics and biomedicine (BIBM), pp 469–476

  193. 193.

    Sharma A, Rani R (2018) BE-DTI’: Ensemble framework for drug target interaction prediction using dimensionality reduction and active learning. Comput Methods Programs Biomed 165:151–162

    Article  Google Scholar 



Author information



Corresponding author

Correspondence to Aman Sharma.

About this article

Cite this article

Sharma, A., Rani, R. A Systematic Review of Applications of Machine Learning in Cancer Prediction and Diagnosis. Arch Computat Methods Eng (2021). https://doi.org/10.1007/s11831-021-09556-z
