Optimizing Body Region Classification with Deep Convolutional Activation Features

  • Obioma Pelka
  • Felix Nensa
  • Christoph M. Friedrich
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11132)


Abstract

The goal of this work is to improve medical image classification of body regions by using automatically generated image keywords as an additional text representation. To build the keyword-generation model, a Long Short-Term Memory (LSTM) based Recurrent Neural Network (RNN) is trained on preprocessed biomedical image captions as text representation together with visual features extracted using Convolutional Neural Networks (CNN). For image representation, deep convolutional activation features and Bag-of-Keypoints (BoK) features are extracted from each radiograph and combined with the automatically generated keywords. Random Forest models and Support Vector Machines are then trained on these multimodal image representations, as well as on the visual representations alone, to predict body regions. Adopting multimodal image features proves to be the better approach, as it increases body-region prediction accuracy.
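The fusion step described above can be sketched with scikit-learn. This is a minimal illustration, not the authors' implementation: the feature dimensions, the placeholder data standing in for CNN activations and generated keywords, and the classifier settings are all assumptions; only the overall scheme (TF-IDF-encoded keywords concatenated with visual features, fed to Random Forest and SVM classifiers) follows the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Placeholder data standing in for real radiographs: random vectors in place
# of deep convolutional activation features, and short keyword strings in
# place of the LSTM-generated keywords (both illustrative, not real data).
n_images, n_visual_dims = 200, 64
visual = rng.normal(size=(n_images, n_visual_dims))
labels = rng.integers(0, 2, n_images)  # two body-region classes
keywords = ["chest lung rib" if y else "skull cranium head" for y in labels]

# Encode the generated keywords with TF-IDF and fuse them with the visual
# features by simple concatenation into one multimodal representation.
tfidf = TfidfVectorizer().fit_transform(keywords).toarray()
multimodal = np.hstack([visual, tfidf])

X_tr, X_te, y_tr, y_te = train_test_split(
    multimodal, labels, test_size=0.25, random_state=0)

# Train both classifier families used in the paper on the fused features.
accs = {}
for clf in (RandomForestClassifier(random_state=0), SVC()):
    clf.fit(X_tr, y_tr)
    accs[type(clf).__name__] = accuracy_score(y_te, clf.predict(X_te))
print(accs)
```

In this toy setup the keyword features alone separate the classes, so the multimodal model easily outperforms a visual-only baseline; with real radiographs, the paper reports the same qualitative effect at a smaller margin.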


Keywords: Bag-of-Keypoints · DeCAF · Deep learning · Multimodal representation · Natural language processing · Radiographs



The work of Obioma Pelka was partially funded by a PhD grant from the University of Applied Sciences and Arts Dortmund (FHDO), Germany.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computer Science, University of Applied Sciences and Arts Dortmund (FHDO), Dortmund, Germany
  2. Faculty of Medicine, University of Duisburg-Essen, Essen, Germany
  3. Department of Diagnostic and Interventional Radiology and Neuroradiology, University Hospital Essen, Essen, Germany
  4. Institute for Medical Informatics, Biometry and Epidemiology (IMIBE), University Hospital Essen, Essen, Germany
