Abstract
We introduce a deep neural network that can be used to localize and detect a region of interest (ROI) in an image. We show how this network helped us extract ROIs when working on two separate problems: a whale recognition problem and a heart volume estimation problem. In the former problem, we used this network to localize the head of the whale while in the later we used it to localize the heart left ventricle from MRI images. Most localization networks regress a bounding box around the region of interest. Unlike these architecture, we treat the problem as a classification problem where each pixel in the image is a separate class. The network is trained on images along with masks which indicate where the object is in the image. We treat the problem as a multi-class classification. Therefore, the last layer has a softmax activation. Furthermore, during training, the mutli-class log loss is minimized just like any classification task.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Data science bowl cardiac challenge data. https://www.kaggle.com/c/second-annual-data-science-bowl. (Accessed on 19 March 2016)
Right whale recognition. https://www.kaggle.com/c/noaa-right-whale-recognition. (Accessed on 19 January 2016)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, pp. 248–255. IEEE (2009)
Erhan, D., Szegedy, C., Toshev, A., Anguelov, D.: Scalable object detection using deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2147–2154 (2014)
Fukushima, K.: Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 36(4), 193–202 (1980)
Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010)
Hinton, G.E., Srivastava, N., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.R.: Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580 (2012)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp. 1097–1105 (2012)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Nair, V., Hinton, G.E.: Rectified linear units improve restricted boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML 2010), pp. 807–814 (2010)
Otsu, N.: A threshold selection method from gray-level histograms. Automatica 11(285–296), 23–27 (1975)
Szegedy, C., Toshev, A., Erhan, D.: Deep neural networks for object detection. In: Advances in Neural Information Processing Systems, pp. 2553–2561 (2013)
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al.: Imagenet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015)
Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., LeCun, Y.: Overfeat: Integrated recognition, localization and detection using convolutional networks. arXiv:1312.6229 (2013)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Song, H.O., Girshick, R., Jegelka, S., Mairal, J., Harchaoui, Z., Darrell, T.: On learning to localize objects with minimal supervision. arXiv preprint arXiv:1403.1024 (2014)
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. arXiv preprint arXiv:1409.4842 (2014)
Szegedy, C., Toshev, A., Erhan, D.: Deep neural networks for object detection. In: Advances in Neural Information Processing Systems, pp. 2553–2561 (2013)
Acknowledgements
This research is partially funded by the Natural Sciences and Engineering Research Council of Canada (NSERC). This support is greatly appreciated. We would also like to thank kaggle, the National Oceanic Atmospheric Administration Fisheries for providing the whale data set. We would also like to thank Booz Allen Hamilton, and the National Heart, Lung, and Blood Institute (NHLBI) for providing the MRI images.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Kabani, A., El-Sakka, M.R. (2016). Object Detection and Localization Using Deep Convolutional Networks with Softmax Activation and Multi-class Log Loss. In: Campilho, A., Karray, F. (eds) Image Analysis and Recognition. ICIAR 2016. Lecture Notes in Computer Science(), vol 9730. Springer, Cham. https://doi.org/10.1007/978-3-319-41501-7_41
Download citation
DOI: https://doi.org/10.1007/978-3-319-41501-7_41
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41500-0
Online ISBN: 978-3-319-41501-7
eBook Packages: Computer ScienceComputer Science (R0)