Learning and Understanding Deep Spatio-Temporal Representations from Free-Hand Fetal Ultrasound Sweeps

  • Yuan Gao
  • J. Alison Noble
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11768)


Abstract

Identifying structures in non-standard fetal ultrasound planes is a significant challenge, even for human experts, because the anatomies vary widely in appearance, scale and position; it is nevertheless important for image interpretation and navigation. In this work, our contribution is three-fold: (i) we model the local temporal dynamics of video clips by applying convolutional LSTMs to intermediate CNN layers, which learn to detect fetal structures at various scales; (ii) we propose an attention-gated LSTM, which generates spatio-temporal attention maps that reveal the intermediate process of structure localisation; and (iii) our approach is end-to-end trainable, and localisation is achieved in a weakly supervised fashion, i.e. with only image-level labels available during training. The proposed attention mechanism is found to substantially improve detection performance in terms of both classification precision and localisation correctness.
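The core idea behind the attention-gated model in (ii) can be illustrated with a minimal, framework-free sketch: score each spatio-temporal feature vector, normalise the scores with a softmax to obtain an attention map, and pool features weighted by that map. The names `attention_pool` and `w_att` are illustrative assumptions for this sketch, not the paper's implementation, which uses learned convolutional LSTM gates.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a flat list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_pool(features, w_att):
    """Soft attention over spatio-temporal feature vectors (a sketch).

    `features`: list of T*H*W feature vectors (each a list of C floats),
    flattened over time and space. `w_att`: a length-C scoring vector
    (in the paper this scoring is learned, gated by the LSTM state).
    Returns the normalised attention map and the attention-weighted
    pooled feature, which would feed an image-level classifier.
    """
    # One scalar relevance score per spatio-temporal position.
    scores = [sum(w * f for w, f in zip(w_att, feat)) for feat in features]
    alphas = softmax(scores)  # attention map; sums to 1 over positions
    c_dim = len(features[0])
    # Attention-weighted average of the feature vectors.
    pooled = [sum(a * feat[c] for a, feat in zip(alphas, features))
              for c in range(c_dim)]
    return alphas, pooled
```

Because only image-level labels supervise the classifier, the attention map `alphas` is learned implicitly, which is what makes the weakly supervised localisation in (iii) possible.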


Keywords

Spatio-temporal neural network · Soft attention · Weakly supervised detection · Non-standard fetal scan planes



Acknowledgements

We acknowledge the ERC (ERC-ADG-2015 694581, project PULSE), the EPSRC (EP/GO36861/1, EP/MO13774/1), the CSC (DPhil Scholarship No. 201408060107) and the NIHR Biomedical Research Centre funding scheme.

Supplementary material

490279_1_En_34_MOESM1_ESM.pdf (8.5 mb)
Supplementary material 1 (pdf 8685 KB)



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Biomedical Image Analysis Group, Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
