Visual Object Detection for an Autonomous Indoor Robotic System

  • Conference paper
Proceedings of 2nd International Conference on Computer Vision & Image Processing

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 703)

Abstract

This paper discusses an indoor robotic system that integrates a state-of-the-art object detection algorithm, trained with data augmented for an indoor scenario, with mechanisms to localize objects in 3D and display them interactively to a user. Size, weight, and power constraints limit the type of computing hardware that can be integrated with a mobile robotic platform. On the other hand, the robot’s mobility, if leveraged properly, provides the opportunity to detect objects from different distances and viewpoints as the robot approaches them, giving more robust results. This work adapts a CNN-based algorithm, YOLO, to run on a GPU-enabled board, the Jetson TX1. An innovative method to calculate the object position in the 3D environment map is discussed, along with the problems that arise, such as duplicate detections that need to be suppressed. Since multiple objects of the same or different classes may be detected, the user can be overloaded with information, so managing the visualization through human–machine interaction takes on an important role. A scheme for informative display of objects is implemented that lets the user interactively view object images as well as their positions in the scene. The complete robotic system, including the interactive visualization tool, can be put to various uses such as search and rescue, indoor assistance, patrolling, and surveillance.
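The abstract does not spell out how object positions are computed or how duplicate detections are suppressed, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes a pinhole camera model with known intrinsics (`fx`, `fy`, `cx`, `cy`), a depth value for the detection's pixel, and a simple rule that treats same-class detections closer than a hypothetical `radius` as duplicates, keeping the most confident one. All names here (`Detection`, `pixel_to_camera`, `suppress_duplicates`) are illustrative.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    label: str          # object class reported by the detector
    confidence: float   # detector score in [0, 1]
    position: tuple     # (x, y, z) in metres, assumed already in the map frame


def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres through a pinhole model.

    Returns a point in the camera frame; transforming it into the map frame
    would additionally need the robot pose, which is not modelled here.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def suppress_duplicates(detections, radius=0.5):
    """Greedy suppression: visit detections from most to least confident and
    drop any that lie within `radius` metres of an already kept detection
    of the same class."""
    kept = []
    for det in sorted(detections, key=lambda d: d.confidence, reverse=True):
        is_duplicate = any(
            k.label == det.label
            and math.dist(k.position, det.position) < radius
            for k in kept
        )
        if not is_duplicate:
            kept.append(det)
    return kept
```

In this sketch, two "chair" detections a few centimetres apart collapse to the more confident one, while a "bottle" at the same spot is kept because it belongs to a different class. The value of `radius` is an assumption that would need tuning to the size of the objects and the accuracy of the map.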

Author information

Corresponding author

Correspondence to Anima M. Sharma.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Sharma, A.M., Syed, I.A., Sharma, B., Jamal, A., Deodhare, D. (2018). Visual Object Detection for an Autonomous Indoor Robotic System. In: Chaudhuri, B., Kankanhalli, M., Raman, B. (eds) Proceedings of 2nd International Conference on Computer Vision & Image Processing. Advances in Intelligent Systems and Computing, vol 703. Springer, Singapore. https://doi.org/10.1007/978-981-10-7895-8_17

  • DOI: https://doi.org/10.1007/978-981-10-7895-8_17

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-7894-1

  • Online ISBN: 978-981-10-7895-8

  • eBook Packages: Engineering, Engineering (R0)
