Abstract
While detecting objects has become easier with deep models, estimating their pose remains a challenging problem in modern vision research. In this work, we propose a method that detects objects and estimates their pose simultaneously in a single model, without intermediate stages. Unlike other approaches, ours is the first attempt to estimate object pose hierarchically using a deep network. We approach the problem by mimicking how humans perceive objects: we tend to estimate an object's pose from experience, usually in a coarse-to-fine fashion. With enough resources to build a deeper model, our approach should be able to produce more accurate results for complex tasks including object tracking, localization, and SLAM [2].
Q. A. Dang and Q. M. B. Nguyen—Contributed equally to this manuscript.
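The coarse-to-fine idea in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that the azimuth angle is discretized into a few coarse sectors, each split into finer sub-sectors. The bin counts (`COARSE_BINS`, `FINE_BINS`) and the function names are hypothetical and are not taken from the paper, whose actual hierarchy may differ.

```python
from typing import Tuple

# Illustrative assumptions only: these bin counts are not from the paper.
COARSE_BINS = 4  # hypothetical: 4 coarse sectors of 90 degrees each
FINE_BINS = 6    # hypothetical: 6 fine sub-sectors of 15 degrees per coarse sector


def encode_azimuth(angle_deg: float) -> Tuple[int, int]:
    """Map an azimuth in degrees to a (coarse, fine) pair of bin indices."""
    angle = angle_deg % 360.0
    coarse_width = 360.0 / COARSE_BINS
    fine_width = coarse_width / FINE_BINS
    # Clamp to the last bin to guard against floating-point edge cases.
    coarse = min(int(angle // coarse_width), COARSE_BINS - 1)
    fine = min(int((angle - coarse * coarse_width) // fine_width), FINE_BINS - 1)
    return coarse, fine


def decode_azimuth(coarse: int, fine: int) -> float:
    """Return the centre angle in degrees of the given (coarse, fine) bin."""
    coarse_width = 360.0 / COARSE_BINS
    fine_width = coarse_width / FINE_BINS
    return coarse * coarse_width + (fine + 0.5) * fine_width
```

Under these assumed bin counts, an azimuth of 100° falls in coarse sector 1 and fine sub-sector 0, whose centre is 97.5°. A hierarchical classifier could predict the coarse index first and then refine it with the fine index, rather than choosing among all 24 bins at once.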
References
Cheng, C., Chen, H., Lee, T., Lai, S., Tsai, Y.: Robust 3D object pose estimation from a single 2D image. In: 2011 Visual Communications and Image Processing (VCIP), pp. 1–4, November 2011
Durrant-Whyte, H., Bailey, T.: Simultaneous localization and mapping: part I. IEEE Robot. Autom. Mag. 13(2), 99–110 (2006)
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., Zheng, X.: TensorFlow: Large-scale machine learning on heterogeneous systems. tensorflow.org (2015)
Everingham, M., Eslami, S.M.A., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL visual object classes challenge: a retrospective. Int. J. Comput. Vis. 111(1), 98–136 (2015)
Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL visual object classes (VOC) challenge. Int. J. Comput. Vis. 88(2), 303–338 (2010)
Girshick, R.B., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. CoRR (2013)
Huttenlocher, D.P., Ullman, S.: Recognizing solid objects by alignment with an image. Int. J. Comput. Vis. 5(2), 195–212 (1990)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems (2012)
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S.E., Fu, C.Y., Berg, A.C.: SSD: single shot multibox detector. CoRR (2015)
Poirson, P., Ammirato, P., Fu, C., Liu, W., Kosecka, J., Berg, A.C.: Fast single shot detection and pose estimation. CoRR abs/1609.05590 (2016)
Redmon, J., Divvala, S.K., Girshick, R.B., Farhadi, A.: You only look once: Unified, real-time object detection. CoRR (2015)
Ren, S., He, K., Girshick, R.B., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. CoRR (2015)
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. (IJCV) 115(3), 211–252 (2015)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR (2014)
Su, H., Qi, C.R., Li, Y., Guibas, L.J.: Render for CNN: viewpoint estimation in images using CNNs trained with rendered 3D model views. CoRR, http://arxiv.org/abs/1505.05641 (2015)
Tulsiani, S., Malik, J.: Viewpoints and keypoints. CoRR, http://arxiv.org/abs/1411.6067 (2014)
Xiang, Y., Mottaghi, R., Savarese, S.: Beyond PASCAL: a benchmark for 3D object detection in the wild. In: 2014 IEEE Winter Conference on Applications of Computer Vision (WACV) (2014)
Zhu, M., Derpanis, K.G., Yang, Y., Brahmbhatt, S., Zhang, M., Phillips, C., Lecce, M., Daniilidis, K.: Single image 3D object detection and pose estimation for grasping. In: 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 3936–3943, May 2014
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Dang, Q.A., Nguyen, Q.M.B., Nguyen, D.D. (2019). Azimuth Angle Detection with Single Shot MultiBox Detecting Model. In: Lee, S., Ismail, R., Choo, H. (eds) Proceedings of the 13th International Conference on Ubiquitous Information Management and Communication (IMCOM) 2019. IMCOM 2019. Advances in Intelligent Systems and Computing, vol 935. Springer, Cham. https://doi.org/10.1007/978-3-030-19063-7_31
DOI: https://doi.org/10.1007/978-3-030-19063-7_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-19062-0
Online ISBN: 978-3-030-19063-7
eBook Packages: Intelligent Technologies and Robotics (R0)