A Multi-armed Bandit to Smartly Select a Training Set from Big Medical Data

  • Benjamín Gutiérrez
  • Loïc Peter
  • Tassilo Klein
  • Christian Wachinger
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10435)


With the availability of big medical image data, selecting an adequate training set is becoming more important as a way to address the heterogeneity of different datasets. Simply including all the data not only incurs high processing costs but can even harm prediction. We formulate the smart and efficient selection of a training dataset from big medical image data as a multi-armed bandit problem, solved by Thompson sampling. Our method assumes that image features are not available when the samples are selected, and therefore relies only on the meta information associated with the images. Our strategy simultaneously exploits data sources with high chances of yielding useful samples and explores new data regions. For our evaluation, we focus on the application of estimating age from a brain MRI. Our results on 7,250 subjects from 10 datasets show that our approach leads to higher accuracy while requiring only a fraction of the training data.
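To make the bandit formulation concrete, the following is a minimal sketch of Thompson sampling over groups of candidate samples. It is not the authors' implementation. It assumes that each arm is a pool of samples sharing some meta information, that usefulness can be scored as a binary reward, and that a Beta-Bernoulli model is adequate. None of these details are specified in the abstract, and the names `thompson_select`, `arm_pools`, and `reward_fn` are hypothetical.

```python
import random


def thompson_select(arm_pools, reward_fn, budget, seed=0):
    """Select up to `budget` samples with Beta-Bernoulli Thompson sampling.

    arm_pools: dict mapping an arm name (e.g. a meta-information group,
               an assumption of this sketch) to a list of candidate samples.
    reward_fn: callable (arm, sample) -> truthy if the sample was useful.
               How usefulness is measured is an assumption left to the caller.
    """
    rng = random.Random(seed)
    # Work on shuffled copies so the caller's pools are left untouched.
    pools = {arm: rng.sample(items, len(items)) for arm, items in arm_pools.items()}
    # Uniform Beta(1, 1) prior per arm, stored as [alpha, beta].
    posterior = {arm: [1.0, 1.0] for arm in pools}
    selected = []

    while len(selected) < budget:
        candidates = [arm for arm, items in pools.items() if items]
        if not candidates:
            break  # every pool is exhausted
        # Draw one plausible success rate per arm and play the largest draw:
        # uncertain arms are still explored, promising arms are exploited.
        draws = {arm: rng.betavariate(*posterior[arm]) for arm in candidates}
        arm = max(draws, key=draws.get)
        sample = pools[arm].pop()
        reward = 1 if reward_fn(arm, sample) else 0
        # Conjugate update of the Beta posterior for the played arm.
        posterior[arm][0] += reward
        posterior[arm][1] += 1 - reward
        selected.append((arm, sample))

    return selected, posterior
```

In this sketch, an arm whose samples keep earning rewards accumulates a posterior concentrated near one and is drawn more often, while arms with few observations keep a wide posterior and are still tried occasionally. A caller would supply pools keyed by whatever meta information is available and a reward function that reflects whether adding the sample helped the downstream predictor.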



This work was supported in part by SAP SE, the Faculty of Medicine at LMU (FöFoLe), and the Bavarian State Ministry of Education, Science and the Arts in the framework of the Centre Digitisation.Bavaria (ZD.B).



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Benjamín Gutiérrez (1, 2)
  • Loïc Peter (2, 3)
  • Tassilo Klein (4)
  • Christian Wachinger (1)

  1. Artificial Intelligence in Medical Imaging (AI-Med), KJP, LMU München, Munich, Germany
  2. CAMP, Technische Universität München, Munich, Germany
  3. Translational Imaging Group, University College London, London, UK
  4. SAP SE Berlin, Berlin, Germany
