Annals of Operations Research, Volume 276, Issue 1–2, pp 5–34

Massive datasets and machine learning for computational biomedicine: trends and challenges

  • Anton Kocheturov
  • Panos M. Pardalos
  • Athanasia Karakitsiou
S.I.: Computational Biomedicine


This survey paper attempts to cover a broad range of topics related to computational biomedicine. The field has been attracting great attention because of the benefits it can bring to society. New technological and theoretical advances have made considerable progress possible. Traditionally, problems emerging in this field are challenging from many perspectives. In this paper, we considered the influence of big data on the field, the problems associated with massive datasets in biomedicine, and ways to address these problems. We analyzed the most commonly used machine learning and feature mining tools, as well as several new trends, such as deep learning and biological networks, for computational biomedicine.



Panos Pardalos was partially supported by the Laboratory of Algorithm and Technologies for Network Analysis, Nizhny Novgorod, Russia.


  1. Abdi, H., & Williams, L. J. (2010). Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2(4), 433–459.Google Scholar
  2. Abeyratne, U. R., Tun, A. K., Lye, N. T., Guanglan, Z., & Saratchandran, P. (2000). RBF networks for source localization in quantitative electrophysiology. Critical Reviews in Biomedical Engineering, 28(3&4), 463–472.Google Scholar
  3. Acharya, U. R., Faust, O., Kadri, N. A., Suri, J. S., & Yu, W. (2013). Automated identification of normal and diabetes heart rate signals using nonlinear measures. Computers in Biology and Medicine, 43(10), 1523–1529.Google Scholar
  4. Acharya, U. R., Sree, S. V., Ang, P. C. A., Yanti, R., & Suri, J. S. (2012). Application of non-linear and wavelet based features for the automated identification of epileptic EEG signals. International Journal of Neural Systems, 22(02), 1250002.Google Scholar
  5. Aizerman, M. A., Braverman, E. M., & Rozonoer, L. I. (1964). Theoretical foundations of potential function method in pattern recognition. Automation and Remote Control, 25, 917–936.Google Scholar
  6. Albarqouni, S., Baur, C., Achilles, F., Belagiannis, V., Demirci, S., & Navab, N. (2016). Aggnet: Deep learning from crowds for mitosis detection in breast cancer histology images. IEEE Transactions on Medical Imaging, 35(5), 1313–1321.Google Scholar
  7. Albert, R., Jeong, H., & Barabási, A.-L. (1999). Internet: Diameter of the world-wide web. Nature, 401(6749), 130.Google Scholar
  8. Almeida, L. B. (2003). Misep-linear and nonlinear ica based on mutual information. Journal of Machine Learning Research, 4, 1297–1318.Google Scholar
  9. Azevedo, F. A. C., Carvalho, L. R. B., Grinberg, L. T., Farfel, J. M., Ferretti, R. E. L., Leite, R. E. P., et al. (2009). Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain. Journal of Comparative Neurology, 513(5), 532–541.Google Scholar
  10. Balasubramanian, M., & Schwartz, E. L. (2002). The isomap algorithm and topological stability. Science, 295(5552), 7–7.Google Scholar
  11. Baldi, P. (2012). Autoencoders, unsupervised learning, and deep architectures. In Proceedings of ICML workshop on unsupervised and transfer learning (pp. 37–49).Google Scholar
  12. Barabási, A.-L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509–512.Google Scholar
  13. Barua, S., Islam, M. M., Yao, X., & Murase, K. (2014). Mwmote-majority weighted minority oversampling technique for imbalanced data set learning. IEEE Transactions on Knowledge and Data Engineering, 26(2), 405–425.Google Scholar
  14. Batal, I., Cooper, G. F., Fradkin, D., Harrison, J., Moerchen, F., & Hauskrecht, M. (2016). An efficient pattern mining approach for event detection in multivariate temporal data. Knowledge and Information Systems, 46(1), 115–150.Google Scholar
  15. Bock, D. D., Lee, W.-C. A., Kerlin, A. M., Andermann, M. L., Hood, G., Wetzel, A. W., et al. (2011). Network anatomy and in vivo physiology of visual cortical neurons. Nature, 471(7337), 177–182.Google Scholar
  16. Boginski, V., & Commander, C. W. (2009). Identifying critical nodes in protein–protein interaction networks. In Clustering challenges in biological networks (pp. 153–167). World Scientific.Google Scholar
  17. Borghini, G., Astolfi, L., Vecchiato, G., Mattia, D., & Babiloni, F. (2014). Measuring neurophysiological signals in aircraft pilots and car drivers for the assessment of mental workload, fatigue and drowsiness. Neuroscience & Biobehavioral Reviews, 44, 58–75.Google Scholar
  18. Boser, B. E., Guyon, I. M., Vapnik, V. N. (1992). A training algorithm for optimal margin classifiers. In Proceedings of the fifth annual workshop on computational learning theory (pp. 144–152). ACM.Google Scholar
  19. Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140.Google Scholar
  20. Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.Google Scholar
  21. Breiman, L., Friedman, J., Stone, C. J., & Olshen, R. A. (1984). Classification and regression trees. Boca Raton: CRC press.Google Scholar
  22. Brosch, T., Tang, L. Y. W., Yoo, Y., Li, D. K. B., Traboulsee, A., & Tam, R. (2016). Deep 3d convolutional encoder networks with shortcuts for multiscale feature integration applied to multiple sclerosis lesion segmentation. IEEE Transactions on Medical Imaging, 35(5), 1229–1239.Google Scholar
  23. Butenko, S., Chaovalitwongse, W. A., & Pardalos, P. M. (2009). Clustering challenges in biological networks. Singapore: World Scientific.Google Scholar
  24. Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., et al. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365.Google Scholar
  25. Chandola, V., Banerjee, A., & Kumar, V. (2009). Anomaly detection: A survey. ACM Computing Surveys (CSUR), 41(3), 15.Google Scholar
  26. Chan, H.-P., Lo, S.-C. B., Sahiner, B., Lam, K. L., & Helvie, M. A. (1995). Computer-aided detection of mammographic microcalcifications: Pattern recognition with an artificial neural network. Medical Physics, 22(10), 1555–1567.Google Scholar
  27. Chang, H.-H., & Moura, J. M. F. (2010). Biomedical signal processing. Biomedical Engineering and Design Handbook, 2, 559–579.Google Scholar
  28. Chang, R. L., Ghamsari, L., Manichaikul, A., Hom, E. F. Y., Balaji, S., Weiqi, F., et al. (2011). Metabolic network reconstruction of chlamydomonas offers insight into light-driven algal metabolism. Molecular Systems Biology, 7(1), 518.Google Scholar
  29. Chang, Y. D. C., Ido, M. S., & Long, Q. (2016). Multiple imputation for general missing data patterns in the presence of high-dimensional data. Scientific Reports, 6, 21689.Google Scholar
  30. Chaovalitwongse, W. A., & Pardalos, P. M. (2008). On the time series support vector machine using dynamic time warping kernel for brain activity classification. Cybernetics and Systems Analysis, 44(1), 125–138.Google Scholar
  31. Charles, D., Gabriel, M., & Furukawa, M. F. (2013). Adoption of electronic health record systems among us non-federal acute care hospitals: 2008–2012. ONC Data Brief, 9, 1–9.Google Scholar
  32. Chawla, M. P. S. (2011). Pca and ica processing methods for removal of artifacts and noise in electrocardiograms: A survey and comparison. Applied Soft Computing, 11(2), 2216–2226.Google Scholar
  33. Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). Smote: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321–357.Google Scholar
  34. Chou, K.-C., & Shen, H.-B. (2007). Recent progress in protein subcellular location prediction. Analytical Biochemistry, 370(1), 1–16.Google Scholar
  35. CireşAn, D., Meier, U., Masci, J., & Schmidhuber, J. (2012). Multi-column deep neural network for traffic sign classification. Neural Networks, 32, 333–338.Google Scholar
  36. Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297.Google Scholar
  37. Crookston, N. L., Finley, A. O., et al. (2008). yaimpute: An R package for kNN imputation. Journal of Statistical Software, 23(10), 1–16.Google Scholar
  38. Csermely, P., Korcsmáros, T., Kiss, H. J. M., London, G., & Nussinov, R. (2013). Structure and dynamics of molecular networks: A novel paradigm of drug discovery: A comprehensive review. Pharmacology & Therapeutics, 138(3), 333–408.Google Scholar
  39. de Rooij, M., Crienen, S., Witjes, J. A., Barentsz, J. O., Rovers, M. M., & Grutters, J. P. C. (2014). Cost-effectiveness of magnetic resonance (mr) imaging and mr-guided targeted biopsy versus systematic transrectal ultrasound-guided biopsy in diagnosing prostate cancer: A modelling study from a health care perspective. European Urology, 66(3), 430–436.Google Scholar
  40. De Solla Price, D. J. (1965). Networks of scientific papers. Science, 149, 510–515.Google Scholar
  41. Dehzangi, A., Paliwal, K., Sharma, A., Dehzangi, O., & Sattar, A. (2013). A combination of feature extraction methods with an ensemble of different classifiers for protein structural class prediction problem. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 10(3), 564–575.Google Scholar
  42. Delorme, A., Sejnowski, T., & Makeig, S. (2007). Enhanced detection of artifacts in EEG data using higher-order statistics and independent component analysis. Neuroimage, 34(4), 1443–1449.Google Scholar
  43. Donoho, D. L., & Grimes, C. (2003). Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data. Proceedings of the National Academy of Sciences, 100(10), 5591–5596.Google Scholar
  44. Drummond, C., Holte, R. C., et al. (2003). C4.5, class imbalance, and cost sensitivity: Why under-sampling beats over-sampling. In Workshop on learning from imbalanced datasets II (Vol. 11, pp. 1–8). Citeseer.Google Scholar
  45. Duarte, N. C., Becker, S. A., Jamshidi, N., Thiele, I., Mo, M. L., Vo, T. D., et al. (2007). Global reconstruction of the human metabolic network based on genomic and bibliomic data. Proceedings of the National Academy of Sciences, 104(6), 1777–1782.Google Scholar
  46. Efron, B., Hastie, T., Johnstone, I., Tibshirani, R., et al. (2004). Least angle regression. The Annals of Statistics, 32(2), 407–499.Google Scholar
  47. Eguiluz, V. M., Chialvo, D. R., Cecchi, G. A., Baliki, M., & Apkarian, A. V. (2005). Scale-free brain functional networks. Physical Review Letters, 94(1), 018102.Google Scholar
  48. Eisenstein, M. (2015). Big data: The power of petabytes. Nature, 527(7576), S2–S4.Google Scholar
  49. Elbuni, A., Kanoun, S., Elbuni, M., & Ali, N. (2009). ECG parameter extraction algorithm using (dwtae) algorithm. In International conference on computer engineering & systems, 2009. ICCES 2009 (pp. 315–320). IEEE.Google Scholar
  50. Elkan, C. (2001). The foundations of cost-sensitive learning. In International joint conference on artificial intelligence (Vol. 17, pp. 973–978). Lawrence Erlbaum Associates Ltd.Google Scholar
  51. Enders, C. K. (2010). Applied missing data analysis. Guilford Press.Google Scholar
  52. Fan, W., Stolfo, S. J., Zhang, J., & Chan, P. K. (1999). Adacost: Misclassification cost-sensitive boosting. In Icml (Vol. 99, pp. 97–105).Google Scholar
  53. Faust, O., Acharya, U. R., Adeli, H., & Adeli, A. (2015). Wavelet-based EEG processing for computer-aided seizure detection and epilepsy diagnosis. Seizure-European Journal of Epilepsy, 26, 56–64.Google Scholar
  54. Ferrari, M., & Quaresima, V. (2012). A brief review on the history of human functional near-infrared spectroscopy (fnirs) development and fields of application. Neuroimage, 63(2), 921–935.Google Scholar
  55. Freeman, L. (1977). A set of measures of centrality based on betweenness. Sociometry, 40(1), 35–41. Scholar
  56. Freund, Y., Schapire, R. E., et al. (1996). Experiments with a new boosting algorithm. In Icml (Vol. 96, pp. 148–156). Bari, Italy.Google Scholar
  57. Friedman, J. H. (1991). Multivariate adaptive regression splines. The Annals of Statistics, 19, 1–67.Google Scholar
  58. Friedman, J. H. (2002). Stochastic gradient boosting. Computational Statistics & Data Analysis, 38(4), 367–378.Google Scholar
  59. Furnival, G. M., & Wilson, R. W. (1974). Regressions by leaps and bounds. Technometrics, 16(4), 499–511.Google Scholar
  60. Gao, Z.-K., Cai, Q., Yang, Y.-X., Dang, W.-D., & Zhang, S.-S. (2016). Multiscale limited penetrable horizontal visibility graph for analyzing nonlinear time series. Scientific Reports, 6, 35622.Google Scholar
  61. Gardner, A. B., Worrell, G. A., Marsh, E., Dlugos, D., & Litt, B. (2007). Human and automated detection of high-frequency oscillations in clinical intracranial EEG recordings. Clinical Neurophysiology, 118(5), 1134–1143.Google Scholar
  62. Gilchrist, J., Ennett, C.M., Frize, M., & Bariciak, E. (2011). Neonatal mortality prediction using real-time medical measurements. In 2011 IEEE international workshop on medical measurements and applications proceedings (MeMeA) (pp. 65–70). IEEE.Google Scholar
  63. Glasser, M. F., Coalson, T. S., Robinson, E. C., Hacker, C. D., Harwell, J., Yacoub, E., et al. (2016). A multi-modal parcellation of human cerebral cortex. Nature, 536(7615), 171–178.Google Scholar
  64. Goel, S., Tomar, P., & Kaur, G. (2016). An optimal wavelet approach for ECG noise cancellation. International Journal of Bio-Science and Bio-Technology, 8(4), 39–52.Google Scholar
  65. Gong, G., He, Y., Concha, L., Lebel, C., Gross, D. W., Evans, A. C., et al. (2008). Mapping anatomical connectivity patterns of human cerebral cortex using in vivo diffusion tensor imaging tractography. Cerebral Cortex, 19(3), 524–536.Google Scholar
  66. Gorber, S. C., Tremblay, M., Moher, D., & Gorber, B. (2007). A comparison of direct vs. self-report measures for assessing height, weight and body mass index: A systematic review. Obesity Reviews, 8(4), 307–326.Google Scholar
  67. Graves, A., Mohamed, A., & Hinton, G. (2013). Speech recognition with deep recurrent neural networks. In 2013 IEEE international conference on acoustics, speech and signal processing (icassp) (pp. 6645–6649). IEEE.Google Scholar
  68. Grech, R., Cassar, T., Muscat, J., Camilleri, K. P., Fabri, S. G., Zervakis, M., et al. (2008). Review on solving the inverse problem in eeg source analysis. Journal of Neuroengineering and Rehabilitation, 5(1), 25.Google Scholar
  69. Green, W. J. F., Ball, G., Hulman, G., Johnson, C., Van Schalwyk, G., Ratan, H. L., et al. (2016). KI67 and DLX2 predict increased risk of metastasis formation in prostate cancer-a targeted molecular approach. British Journal of Cancer, 115(2), 236.Google Scholar
  70. Greenspan, H., van Ginneken, B., & Summers, R. M. (2016). Guest editorial deep learning in medical imaging: Overview and future promise of an exciting new technique. IEEE Transactions on Medical Imaging, 35(5), 1153–1159.Google Scholar
  71. Grossi, E., Veggo, F., Narzisi, A., Compare, A., & Muratori, F. (2016). Pregnancy risk factors in autism: A pilot study with artificial neural networks. Pediatric Research, 79(2), 339.Google Scholar
  72. Guo, H., & Viktor, H. L. (2004). Learning from imbalanced data sets with boosting and data generation: The databoost-im approach. ACM Sigkdd Explorations Newsletter, 6(1), 30–39.Google Scholar
  73. Hajian-Tilaki, K. (2013). Receiver operating characteristic (roc) curve analysis for medical diagnostic test evaluation. Caspian Journal of Internal Medicine, 4(2), 627.Google Scholar
  74. Halford, J. J., Sabau, D., Drislane, F. W., Tsuchida, T. N., & Sinha, S. R. (2016). American clinical neurophysiology society guideline 4: Recording clinical eeg on digital media. The Neurodiagnostic Journal, 56(4), 261–265.Google Scholar
  75. Han, H., Wang, W.-Y., & Mao, B.-H. (2005). Borderline-smote: A new over-sampling method in imbalanced data sets learning. In International conference on intelligent computing (pp. 878–887). Springer.Google Scholar
  76. Harrison, R. R., Kier, R. J., Chestek, C. A., Gilja, V., Nuyujukian, P., Ryu, S., et al. (2009). Wireless neural recording with single low-power integrated circuit. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 17(4), 322–329.Google Scholar
  77. He, H., Bai, Y., Garcia, E. A., & Li, S. (2008). Adasyn: Adaptive synthetic sampling approach for imbalanced learning. In IEEE international joint conference on neural networks, 2008. IJCNN 2008 (IEEE world congress on computational intelligence) (pp. 1322–1328). IEEE.Google Scholar
  78. He, H., & Garcia, E. A. (2009). Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9), 1263–1284.Google Scholar
  79. Helmstaedter, M. (2013). Cellular-resolution connectomics: Challenges of dense neural circuit reconstruction. Nature Methods, 10(6), 501.Google Scholar
  80. Hess, K. R., Keith Anderson, W., Symmans, F., Valero, V., Ibrahim, N., Mejia, J. A., et al. (2006). Pharmacogenomic predictor of sensitivity to preoperative chemotherapy with paclitaxel and fluorouracil, doxorubicin, and cyclophosphamide in breast cancer. Journal of Clinical Oncology, 24(26), 4236–4244.Google Scholar
  81. Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55–67.Google Scholar
  82. Hoffmann, A., Huang, Y., Suetsugu-Maki, R., Ringelberg, C. S., Tomlinson, C. R., Rio-Tsonis, K. D., et al. (2012). Implication of the mir-184 and mir-204 competitive rna network in control of mouse secondary cataract. Molecular Medicine, 18(1), 528.Google Scholar
  83. Hormozdiari, F., Penn, O., Borenstein, E., & Eichler, E. E. (2015). The discovery of integrated gene networks for autism and related disorders. Genome Research, 25(1), 142–154.Google Scholar
  84. Huang, P.-S., Boyken, S. E., & Baker, D. (2016). The coming of age of de novo protein design. Nature, 537(7620), 320–327.Google Scholar
  85. Hughes, C., Henderson, A., Kansiz, M., Dorling, K. M., Jimenez-Hernandez, M., Brown, Michael D., et al. (2015). Enhanced ftir bench-top imaging of single biological cells. Analyst, 140(7), 2080–2085.Google Scholar
  86. Hyvärinen, A., Karhunen, J., & Oja, E. (2004). Independent component analysis (Vol. 46). Wiley.Google Scholar
  87. Hyvärinen, A., & Pajunen, P. (1999). Nonlinear independent component analysis: Existence and uniqueness results. Neural Networks, 12(3), 429–439.Google Scholar
  88. Iasemidis, L. D., Shiau, D.-S., Pardalos, P. M., Chaovalitwongse, W., Narayanan, K., Prasad, A., et al. (2005). Long-term prospective on-line real-time seizure prediction. Clinical Neurophysiology, 116(3), 532–544.Google Scholar
  89. Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124.Google Scholar
  90. Jeong, H., Mason, S. P., Barabási, A.-L., & Oltvai, Z. N. (2001). Lethality and centrality in protein networks. Nature, 411(6833), 41.Google Scholar
  91. Jeong, H., Tombor, B., Albert, R., Oltvai, Z. N., & Barabási, A.-L. (2000). The large-scale organization of metabolic networks. Nature, 407(6804), 651.Google Scholar
  92. Jia, J., Liu, Z., Xiao, X., Liu, B., & Chou, K.-C. (2015). ippi-esml: An ensemble classifier for identifying the interactions of proteins by incorporating their physicochemical properties and wavelet transforms into pseaac. Journal of Theoretical Biology, 377, 47–56.Google Scholar
  93. Jia, Y., Wei, E., Wang, X., Zhang, X., Morrison, J. C., Parikh, M., et al. (2014). Optical coherence tomography angiography of optic disc perfusion in glaucoma. Ophthalmology, 121(7), 1322–1332.Google Scholar
  94. Johnson, A. E. W., Pollard, T. J., Shen, L., Li-wei, H. L., Feng, M., Ghassemi, M., et al. (2016). Mimic-III, a freely accessible critical care database. Scientific Data, 3, 160035.Google Scholar
  95. Johnsson, P., Ackley, A., Vidarsdottir, L., Lui, W.-O., Corcoran, M., Grandér, D., et al. (2013). A pseudogene long-noncoding-rna network regulates pten transcription and translation in human cells. Nature Structural and Molecular Biology, 20(4), 440.Google Scholar
  96. Jombart, T., Devillard, S., & Balloux, F. (2010). Discriminant analysis of principal components: A new method for the analysis of genetically structured populations. BMC Genetics, 11(1), 94.Google Scholar
  97. Kabir, M. A., & Shahnaz, C. (2012). Denoising of ECG signals based on noise reduction algorithms in EMD and wavelet domains. Biomedical Signal Processing and Control, 7(5), 481–489.Google Scholar
  98. Kasthuri, N., Hayworth, K. J., Berger, D. R., Schalek, R. L., Conchello, J. A., Knowles-Barley, S., et al. (2015). Saturated reconstruction of a volume of neocortex. Cell, 162(3), 648–661.Google Scholar
  99. Khaligh-Razavi, S.-M., & Kriegeskorte, N. (2014). Deep supervised, but not unsupervised, models may explain it cortical representation. PLoS Computational Biology, 10(11), e1003915.Google Scholar
  100. Khalilia, M., Chakraborty, S., & Popescu, M. (2011). Predicting disease risks from highly imbalanced data using random forest. BMC Medical Informatics and Decision Making, 11(1), 51.Google Scholar
  101. Kohavi, R., & John, G. H. (1997). Wrappers for feature subset selection. Artificial Intelligence, 97(1–2), 273–324.Google Scholar
  102. Kohonen, T. (1998). The self-organizing map. Neurocomputing, 21(1–3), 1–6.Google Scholar
  103. Korenkevych, D., Chien, J.-H., Zhang, J., Shiau, D.-S., Sackellares, C., & Pardalos, P. M. (2013). Small world networks in computational neuroscience. In Handbook of combinatorial optimization (pp. 3057–3088). Springer.Google Scholar
  104. Korenkevych, D., Ozrazgat-Baslanti, T., Thottakkara, P., Hobson, C. E., Pardalos, P., Momcilovic, P., et al. (2016). The pattern of longitudinal change in serum creatinine and ninety-day mortality after major surgery. Annals of Surgery, 263(6), 1219.Google Scholar
  105. Kotsiantis, S., Kanellopoulos, D., Pintelas, P., et al. (2006). Handling imbalanced datasets: A review. GESTS International Transactions on Computer Science and Engineering, 30(1), 25–36.Google Scholar
  106. Kubat, M., & Matwin, S. (1997). Addressing the curse of imbalanced data sets: One sided sampling. In Proceedings of the fourteenth international conference on machine learning (pp. 179–186).Google Scholar
  107. Latora, V., & Marchiori, M. (2003). Economic small-world behavior in weighted networks. The European Physical Journal B-Condensed Matter and Complex Systems, 32(2), 249–263.Google Scholar
  108. Lee, D.-S., Park, J., Kay, K. A., Christakis, N. A., Oltvai, Z. N., & Barabási, A.-L. (2008). The implications of human metabolic network topology for disease comorbidity. Proceedings of the National Academy of Sciences, 105(29), 9880–9885.Google Scholar
  109. Ling, C. X., & Li, C. (1998). Data mining for direct marketing: Problems and solutions. In KDD (Vol. 98, pp. 73–79).Google Scholar
  110. Ling, C. X., & Sheng, V. S. (2011). Cost-sensitive learning. In Encyclopedia of machine learning (pp. 231–235). Springer.Google Scholar
  111. Ling, C. X., Yang, Q., Wang, J., & Zhang, S. (2004). Decision trees with minimal costs. In Proceedings of the twenty-first international conference on Machine learning (p.  69). ACM.Google Scholar
  112. Liu, B., Wei, Y., Zhang, Y., & Yang, Q. (2017). Deep neural networks for high dimension, low sample size data. In Proceedings of the twenty-sixth international joint conference on artificial intelligence, IJCAI-17 (pp. 2287–2293).Google Scholar
  113. Liu, W., Liu, C., Chen, F., Yang, J., & Zheng, L. (2016). Discrimination of transgenic soybean seeds by terahertz spectroscopy. Scientific Reports, 6, 35799.Google Scholar
  114. Liu, X.-Y., Wu, J., & Zhou, Z.-H. (2009). Exploratory undersampling for class-imbalance learning. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 39(2), 539–550.Google Scholar
  115. Liu, X.-Y., & Zhou, Z.-H. (2006). The influence of class imbalance on cost-sensitive learning: An empirical study. In Sixth international conference on data mining, 2006. ICDM’06 (pp. 970–974). IEEE.Google Scholar
  116. Lorente, D., Aleixos, N., Gómez-Sanchis, J., Cubero, S., García-Navarrete, Or L., & Blasco, J. (2012). Recent advances and applications of hyperspectral imaging for fruit and vegetable quality assessment. Food and Bioprocess Technology, 5(4), 1121–1142.Google Scholar
  117. Lowery, A. J., Miller, N., Devaney, A., McNeill, R. E., Davoren, P. A., Lemetre, C., et al. (2009). Microrna signatures predict oestrogen receptor, progesterone receptor and her2/neu receptor status in breast cancer. Breast Cancer Research, 11(3), R27.Google Scholar
  118. Luo, J., Min, W., Gopukumar, D., & Zhao, Y. (2016). Big data application in biomedical research and health care: A literature review. Biomedical Informatics Insights, 8, 1.Google Scholar
  119. Mangasarian, O. L., & Wild, E. W. (2006). Multisurface proximal support vector machine classification via generalized eigenvalues. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(1), 69–74.Google Scholar
  120. Mani, I., & Zhang, I. (2003). kNN approach to unbalanced data distributions: A case study involving information extraction. In Proceedings of workshop on learning from imbalanced datasets (Vol. 126).Google Scholar
  121. Manjón, J. V., Coupé, P., & Buades, A. (2015). Mri noise estimation and denoising using non-local pca. Medical Image Analysis, 22(1), 35–47.Google Scholar
  122. Mardis, E. R. (2011). A decades perspective on DNA sequencing technology. Nature, 470(7333), 198.Google Scholar
  123. Martis, R. J., Acharya, U. R., Lim, C. M., Mandana, K. M., Ray, A. K., & Chakraborty, C. (2013). Application of higher order cumulant features for cardiac health diagnosis using ECG signals. International Journal of Neural Systems, 23(04), 1350014.Google Scholar
  124. McCarthy, K., Zabar, B., & Weiss, G. (2005). Does cost-sensitive learning beat sampling for classifying rare classes? In Proceedings of the 1st international workshop on Utility-based data mining (pp. 69–77). ACM.Google Scholar
  125. Mika, S., Ratsch, G., Weston, J., Scholkopf, B., & Mullers, K.-R. (1999). Fisher discriminant analysis with kernels. In Neural networks for signal processing IX, 1999. Proceedings of the 1999 IEEE signal processing society workshop (pp. 41–48). IEEE.Google Scholar
  126. Mikula, S. (2016). Progress towards mammalian whole-brain cellular connectomics. Frontiers in Neuroanatomy, 10, 62.Google Scholar
  127. Ming, L., Zhang, Q., Deng, M., Miao, J., Guo, Y., Gao, W., et al. (2008). An analysis of human microrna and disease associations. PloS ONE, 3(10), e3420.Google Scholar
  128. Miranda, H., Gilja, V., Chestek, C. A., Shenoy, K. V., & Meng, T. H. (2010). Hermesd: A high-rate long-range wireless transmission system for simultaneous multichannel neural recording applications. IEEE Transactions on Biomedical Circuits and Systems, 4(3), 181–191.Google Scholar
  129. Moore, G. E., et al. (1975). Progress in digital integrated electronics. Electron Devices Meeting, 21, 11–13.Google Scholar
  130. Murray, C. J. L., Lozano, R., Flaxman, A. D., Serina, P., Phillips, D., Stewart, A., et al. (2014). Using verbal autopsy to measure causes of death: The comparative performance of existing methods. BMC Medicine, 12(1), 5.Google Scholar
  131. Naimi, H., Adamou-Mitiche, A. B. H., & Mitiche, L. (2015). Medical image denoising using dual tree complex thresholding wavelet transform and wiener filter. Journal of King Saud University-Computer and Information Sciences, 27(1), 40–45.Google Scholar
  132. Naseer, N., Hong, M. J., & Hong, K.-S. (2014). Online binary decision decoding using functional near-infrared spectroscopy for the development of brain-computer interface. Experimental Brain Research, 232(2), 555–564.Google Scholar
  133. Newman, M. E. J. (2012). Communities, modules and large-scale structure in networks. Nature Physics, 8(1), 25.Google Scholar
  134. Newman, M. E. J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113.Google Scholar
  135. Ng, M., Fleming, T., Robinson, M., Thomson, B., Graetz, N., Margono, C., et al. (2014). Global, regional, and national prevalence of overweight and obesity in children and adults during 1980–2013: A systematic analysis for the global burden of disease study 2013. The Lancet, 384(9945), 766–781.Google Scholar
  136. Nguyen, T. B., Wang, S., Anugu, V., Rose, N., McKenna, M., Petrick, N., et al. (2012). Distributed human intelligence for colonic polyp classification in computer-aided detection for CT colonography. Radiology, 262(3), 824–833.Google Scholar
  137. Niedermeyer, E., & da Silva, F. L. (Eds.). (2005). Electroencephalography: Basic principles, clinical applications, and related fields. Lippincott Williams & Wilkins.Google Scholar
  138. Nunez, P. L., & Pilgreen, K. L. (1991). The spline-laplacian in clinical neurophysiology: A method to improve EEG spatial resolution. Journal of Clinical Neurophysiology: Official Publication of the American Electroencephalographic Society, 8(4), 397–413.Google Scholar
  139. Oberhardt, M. A., Palsson, B. Ø., & Papin, J. A. (2009). Applications of genome-scale metabolic reconstructions. Molecular Systems Biology, 5(1), 320.Google Scholar
  140. Oh, S., Lee, M. S., & Zhang, B.-T. (2011). Ensemble learning with active example selection for imbalanced biomedical data classification. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 8(2), 316–325.Google Scholar
  141. Orth, J. D., Conrad, T. M., Na, J., Lerman, J. A., Nam, H., Feist, A. M., et al. (2011). A comprehensive genome-scale reconstruction of escherichia coli metabolism2011. Molecular Systems Biology, 7(1), 535.Google Scholar
  142. Pappu, V., Panagopoulos, O. P., Xanthopoulos, P., & Pardalos, P. M. (2015). Sparse proximal support vector machines for feature selection in high dimensional datasets. Expert Systems with Applications, 42(23), 9183–9191.
  143. Pardalos, P. M., Chaovalitwongse, W., Iasemidis, L. D., Sackellares, J. C., Shiau, D.-S., Carney, P. R., et al. (2004). Seizure warning algorithm based on optimization and nonlinear dynamics. Mathematical Programming, 101(2), 365–385.
  144. Park, Y. S., Choi, Y. H., Lee, H. S., Moon, D. J., Kim, S. G., Lee, J. H., et al. (2013). The impact of laser Doppler imaging on the early decision-making process for surgical intervention in adults with indeterminate burns. Burns, 39(4), 655–661.
  145. Peng, Y., Jiang, Y., Yang, C., Brown, J. B., Antic, T., Sethi, I., et al. (2013). Quantitative analysis of multiparametric prostate MR images: Differentiation between prostate cancer and normal tissue and correlation with Gleason score - a computer-aided diagnosis development study. Radiology, 267(3), 787–796.
  146. Picard, D. (1985). Testing and estimating change-points in time series. Advances in Applied Probability, 17(4), 841–867.
  147. Quinlan, J. R. (1993). Combining instance-based and model-based learning. In Proceedings of the tenth international conference on machine learning (pp. 236–243).
  148. Quinlan, J. R., et al. (1992). Learning with continuous classes. In 5th Australian joint conference on artificial intelligence (Vol. 92, pp. 343–348). Singapore.
  149. Raghunathan, T., & Siscovick, D. (1996). A multiple-imputation analysis of a case-control study of the risk of primary cardiac arrest among pharmacologically treated hypertensives. Journal of the Royal Statistical Society. Series C (Applied Statistics), 45, 335–352.
  150. Ramgopal, S., Thome-Souza, S., Jackson, M., Kadish, N. E., Fernández, I. S., Klehm, J., et al. (2014). Seizure detection, seizure prediction, and closed-loop warning systems in epilepsy. Epilepsy & Behavior, 37, 291–307.
  151. Robb, R. A. (1999). Biomedical imaging, visualization, and analysis. Wiley.
  152. Rokach, L. (2010). Ensemble-based classifiers. Artificial Intelligence Review, 33(1–2), 1–39.
  153. Romero, I. (2011). PCA and ICA applied to noise reduction in multi-lead ECG. In Computing in cardiology, 2011 (pp. 613–616). IEEE.
  154. Roweis, S. T., & Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500), 2323–2326.
  155. Rubin, D. B. (2004). Multiple imputation for nonresponse in surveys (Vol. 81). Wiley.
  156. Saeys, Y., Inza, I., & Larrañaga, P. (2007). A review of feature selection techniques in bioinformatics. Bioinformatics, 23(19), 2507–2517.
  157. Salam, M. T., Sawan, M., & Nguyen, D. K. (2011). A novel low-power-implantable epileptic seizure-onset detector. IEEE Transactions on Biomedical Circuits and Systems, 5(6), 568–578.
  158. Salathé, M., Kazandjieva, M., Lee, J. W., Levis, P., Feldman, M. W., & Jones, J. H. (2010). A high-resolution human contact network for infectious disease transmission. Proceedings of the National Academy of Sciences, 107(51), 22020–22025.
  159. Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks, 61, 85–117.
  160. Scholz, M., Kaplan, F., Guy, C. L., Kopka, J., & Selbig, J. (2005). Non-linear PCA: A missing data approach. Bioinformatics, 21(20), 3887–3895.
  161. Shaw, L. J., Raggi, P., Berman, D. S., & Callister, T. Q. (2006). Coronary artery calcium as a measure of biologic age. Atherosclerosis, 188(1), 112–119.
  162. Shin, H.-C., Roth, H. R., Gao, M., Lu, L., Xu, Z., Nogues, I., et al. (2016). Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Transactions on Medical Imaging, 35(5), 1285–1298.
  163. Shivaswamy, P. K., Bhattacharyya, C., & Smola, A. J. (2006). Second order cone programming approaches for handling missing and uncertain data. Journal of Machine Learning Research, 7, 1283–1314.
  164. Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489.
  165. Sinha, S. R., Sullivan, L. R., Sabau, D., Orta, D. S. J., Dombrowski, K. E., Halford, J. J., et al. (2016). American Clinical Neurophysiology Society guideline 1: Minimum technical requirements for performing clinical electroencephalography. The Neurodiagnostic Journal, 56(4), 235–244.
  166. Skidmore, F., Korenkevych, D., Liu, Y., He, G., Bullmore, E., & Pardalos, P. M. (2011). Connectivity brain networks based on wavelet correlation analysis in Parkinson fMRI data. Neuroscience Letters, 499(1), 47–51.
  167. Sosenko, J. M., Mahon, J., Rafkin, L., Lachin, J. M., Krause-Steinrauf, H., Krischer, J. P., et al. (2011). A comparison of the baseline metabolic profiles between Diabetes Prevention Trial-Type 1 and TrialNet Natural History Study participants. Pediatric Diabetes, 12(2), 85–90.
  168. Sporns, O., Honey, C. J., & Kötter, R. (2007). Identification and classification of hubs in brain networks. PLoS ONE, 2(10), e1049.
  169. Sporns, O., Tononi, G., & Edelman, G. M. (2000). Theoretical neuroanatomy: Relating anatomical and functional connectivity in graphs and cortical connection matrices. Cerebral Cortex, 10(2), 127–141.
  170. Statnikov, A. (2011). A gentle introduction to support vector machines in biomedicine: Theory and methods (Vol. 1). World Scientific.
  171. Szklarczyk, D., Franceschini, A., Wyder, S., Forslund, K., Heller, D., Huerta-Cepas, J., et al. (2014). STRING v10: Protein–protein interaction networks, integrated over the tree of life. Nucleic Acids Research, 43(D1), D447–D452.
  172. Tan, M., Wang, L., & Tsang, I. W. (2010). Learning sparse SVM for feature selection on very high dimensional datasets. In Proceedings of the 27th international conference on machine learning (ICML-10) (pp. 1047–1054).
  173. Tang, G., & Qin, A. (2008). ECG de-noising based on empirical mode decomposition. In The 9th international conference for young computer scientists, 2008. ICYCS 2008 (pp. 903–906). IEEE.
  174. Targ, S., Almeida, D., & Lyman, K. (2016). Resnet in resnet: Generalizing residual architectures. arXiv preprint arXiv:1603.08029.
  175. Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
  176. Tsirka, V., Simos, P. G., Vakis, A., Kanatsouli, K., Vourkas, M., Erimaki, S., et al. (2011). Mild traumatic brain injury: Graph-model characterization of brain networks for episodic memory. International Journal of Psychophysiology, 79(2), 89–96.
  177. van Buuren, S., & Groothuis-Oudshoorn, K. (2010). mice: Multivariate imputation by chained equations in R. Journal of Statistical Software, 45, 1–68.
  178. van Grinsven, M. J. J. P., van Ginneken, B., Hoyng, C. B., Theelen, T., & Sánchez, C. I. (2016). Fast convolutional neural network training using selective data sampling: Application to hemorrhage detection in color fundus images. IEEE Transactions on Medical Imaging, 35(5), 1273–1284.
  179. Vapnik, V. N., & Lerner, A. Y. (1963). Recognition of patterns with help of generalized portraits. Avtomat. i Telemekh, 24(6), 774–780.
  180. Vasconcelos, C. N., & Vasconcelos, B. N. (2017). Increasing deep learning melanoma classification by classical and expert knowledge based image transforms. CoRR, arXiv:1702.07025.
  181. Waldrop, M. M. (2016). More than Moore. Nature, 530(7589), 144–148.
  182. Wang, W., Liu, Q.-H., Cai, S.-M., Tang, M., Braunstein, L. A., & Stanley, H. E. (2016). Suppressing disease spreading by using information diffusion on multiplex networks. Scientific Reports, 6, 29259.
  183. Wang, X., Fan, N., & Pardalos, P. M. (2018). Robust chance-constrained support vector machines with second-order moment information. Annals of Operations Research, 263(1–2), 45–68.
  184. Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of 'small-world' networks. Nature, 393(6684), 440.
  185. Webb, A., & Kagadis, G. C. (2003). Introduction to biomedical imaging. Medical Physics, 30(8), 2267–2267.
  186. White, J. G., Southgate, E., Thomson, J. N., & Brenner, S. (1986). The structure of the nervous system of the nematode Caenorhabditis elegans. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 314(1165), 1–340.
  187. Wong, H. R., Lindsell, C. J., Pettilä, V., Meyer, N. J., Thair, S. A., Karlsson, S., et al. (2014). A multibiomarker-based outcome risk stratification model for adult septic shock. Critical Care Medicine, 42(4), 781.
  188. Wong, S. C., Gatt, A., Stamatescu, V., & McDonnell, M. D. (2016). Understanding data augmentation for classification: When to warp? In 2016 international conference on digital image computing: techniques and applications (DICTA) (pp. 1–6). IEEE.
  189. Xu, Y., Jia, R., Mou, L., Li, G., Chen, Y., Lu, Y., & Jin, Z. (2016). Improved relation classification by deep recurrent neural networks with data augmentation. In COLING.
  190. Yao, D. (2001). A method to standardize a reference of scalp EEG recordings to a point at infinity. Physiological Measurement, 22(4), 693.
  191. Yu, Y., Su, R., Wang, L., Qi, W., & He, Z. (2010). Comparative QSAR modeling of antitumor activity of ARC-111 analogues using stepwise MLR, PLS, and ANN techniques. Medicinal Chemistry Research, 19(9), 1233–1244.
  192. Zhang, D., Wang, Y., Zhou, L., Yuan, H., Shen, D., & Alzheimer’s Disease Neuroimaging Initiative. (2011). Multimodal classification of Alzheimer’s disease and mild cognitive impairment. Neuroimage, 55(3), 856–867.
  193. Zhao, X.-M., Li, X., Chen, L., & Aihara, K. (2008). Protein classification with imbalanced data. Proteins: Structure, Function, and Bioinformatics, 70(4), 1125–1132.
  194. Zhou, J., Greicius, M. D., Gennatas, E. D., Growdon, M. E., Jang, J. Y., Rabinovici, G. D., et al. (2010). Divergent network connectivity changes in behavioural variant frontotemporal dementia and Alzheimer's disease. Brain, 133(5), 1352–1367.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Center for Applied Optimization, University of Florida, Gainesville, USA
  2. Laboratory of Algorithms and Technologies for Network Analysis, National Research University Higher School of Economics, Nizhny Novgorod, Russia
  3. Department of Business Administration, Technological Educational Institute of Central Macedonia, Serres, Greece