Object Detection and Localization Using Deep Convolutional Networks with Softmax Activation and Multi-class Log Loss

  • Conference paper
Image Analysis and Recognition (ICIAR 2016)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9730)

Abstract

We introduce a deep neural network that can be used to localize and detect a region of interest (ROI) in an image. We show how this network helped us extract ROIs in two separate problems: whale recognition and heart volume estimation. In the former, we used the network to localize the head of the whale, while in the latter we used it to localize the left ventricle of the heart in MRI images. Most localization networks regress a bounding box around the region of interest. Unlike these architectures, we treat localization as a multi-class classification problem in which each pixel in the image is a separate class. The network is trained on images together with masks that indicate where the object is in the image. Accordingly, the last layer has a softmax activation, and during training the multi-class log loss is minimized, just as in any classification task.
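The per-pixel formulation in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the choice to normalise the mask so that it sums to 1, and the thresholding rule used to turn the probability map into a box are all assumptions made for illustration, and the "network output" here is just a hand-written list of scores.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of per-pixel scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mask_to_target(mask):
    """Flatten a binary mask and normalise it to sum to 1.

    Assumption: the mask is turned into a target distribution over pixels;
    the abstract does not say how the authors encode it.
    """
    flat = [float(v) for row in mask for v in row]
    total = sum(flat)
    if total == 0:
        raise ValueError("mask must mark at least one pixel")
    return [v / total for v in flat]


def multiclass_log_loss(probs, target, eps=1e-12):
    """Cross-entropy between the target distribution and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for p, t in zip(probs, target))


def bounding_box(probs, width, threshold):
    """Inclusive (top, left, bottom, right) box around pixels with prob >= threshold.

    Assumption: a simple fixed threshold stands in for whatever
    post-processing the paper actually uses.
    """
    hits = [divmod(i, width) for i, p in enumerate(probs) if p >= threshold]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows), max(cols))


# Toy 3x4 image: the object occupies the 2x2 block at rows 1-2, columns 1-2.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
# Hypothetical last-layer scores, one per pixel, in row-major order.
logits = [0.0, 0.0, 0.0, 0.0,
          0.0, 4.0, 4.0, 0.0,
          0.0, 4.0, 4.0, 0.0]

probs = softmax(logits)
loss = multiclass_log_loss(probs, mask_to_target(mask))
box = bounding_box(probs, width=4, threshold=0.5 * max(probs))
```

With these scores the probability mass concentrates on the four masked pixels, so the loss is lower than it would be for a uniform prediction, and the thresholded map yields a box around the marked block.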

Acknowledgements

This research is partially funded by the Natural Sciences and Engineering Research Council of Canada (NSERC). This support is greatly appreciated. We would also like to thank Kaggle and the National Oceanic and Atmospheric Administration (NOAA) Fisheries for providing the whale data set, and Booz Allen Hamilton and the National Heart, Lung, and Blood Institute (NHLBI) for providing the MRI images.

Author information

Corresponding author

Correspondence to Mahmoud R. El-Sakka.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Kabani, A., El-Sakka, M.R. (2016). Object Detection and Localization Using Deep Convolutional Networks with Softmax Activation and Multi-class Log Loss. In: Campilho, A., Karray, F. (eds) Image Analysis and Recognition. ICIAR 2016. Lecture Notes in Computer Science, vol 9730. Springer, Cham. https://doi.org/10.1007/978-3-319-41501-7_41

  • DOI: https://doi.org/10.1007/978-3-319-41501-7_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41500-0

  • Online ISBN: 978-3-319-41501-7

  • eBook Packages: Computer Science, Computer Science (R0)
