Real-time Recognition and Pursuit in Robots Based on 3D Depth Data

  • Somar Boubou
  • Hamed Jabbari Asl
  • Tatsuo Narikiyo
  • Michihiro Kawanishi
Article

Abstract

In this work, we address the problem of robot pursuit based on a real-time object recognition system that uses 3D depth sensors. In contrast to traditional recognition approaches based on RGB-D data, we propose a novel global online descriptor designed for object recognition from depth data alone. The proposed descriptor, which we name the Differential Histogram of Normal Vectors (DHONV), is designed to extract the geometric characteristics of the 3D object surfaces captured in depth images. To obtain a compact description of the visible 3D surfaces of each object, we quantize the differential angles of the surface normal vectors into a 1D histogram. Object recognition experiments on a self-collected dataset and on a benchmark RGB-D object dataset show that the proposed descriptor outperforms other depth-based descriptors. Moreover, we conducted real-time experiments with RoboCars, which validate the capability of the proposed method to perform real-time recognition and pursuit tasks in indoor environments based solely on depth data.
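The abstract describes the descriptor only at a high level, so the following is a minimal sketch of the general idea (quantizing angles between neighbouring surface normals into a 1D histogram), not the authors' DHONV implementation. The normal estimation by central differences, the choice of right and lower neighbours, the bin count, and all function names are assumptions made for illustration.

```python
import math


def surface_normals(depth):
    """Estimate a unit normal for each interior pixel of a depth grid.

    Assumption: normals come from central differences of the depth values;
    the source does not say how normals are estimated.
    """
    rows, cols = len(depth), len(depth[0])
    normals = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dzdx = (depth[r][c + 1] - depth[r][c - 1]) / 2.0
            dzdy = (depth[r + 1][c] - depth[r - 1][c]) / 2.0
            # A surface z = f(x, y) has the unnormalised normal (-dz/dx, -dz/dy, 1).
            length = math.sqrt(dzdx * dzdx + dzdy * dzdy + 1.0)
            normals[r][c] = (-dzdx / length, -dzdy / length, 1.0 / length)
    return normals


def angle_between(n1, n2):
    """Angle in radians between two unit vectors, clamped against rounding."""
    dot = sum(a * b for a, b in zip(n1, n2))
    return math.acos(max(-1.0, min(1.0, dot)))


def differential_normal_histogram(depth, bins=8):
    """Normalised 1D histogram of angles between neighbouring normals.

    Assumption: each interior pixel is compared with its right and lower
    neighbours, and angles in [0, pi] are split into equal-width bins.
    """
    normals = surface_normals(depth)
    rows, cols = len(depth), len(depth[0])
    hist = [0] * bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr >= rows - 1 or cc >= cols - 1:
                    continue  # border pixels have no estimated normal
                angle = angle_between(normals[r][c], normals[rr][cc])
                index = min(int(angle / math.pi * bins), bins - 1)
                hist[index] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * bins


# A tilted plane has identical normals everywhere, so all mass falls in bin 0.
plane = [[2.0 * c + r for c in range(5)] for r in range(5)]
# A roof-shaped ridge changes normal direction at its crest.
ridge = [[abs(c - 3.0) for c in range(7)] for r in range(5)]

plane_hist = differential_normal_histogram(plane)
ridge_hist = differential_normal_histogram(ridge)
```

In this sketch a planar surface yields a histogram concentrated in the first bin, while a surface with creases or curvature spreads mass into higher bins, which is the kind of geometric signature a classifier such as an SVM could then be trained on.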

Keywords

Object recognition · 3D · Depth sensors · Robots · SVM

Copyright information

© Springer Science+Business Media B.V., part of Springer Nature 2018

Authors and Affiliations

  • Somar Boubou (1)
  • Hamed Jabbari Asl (1)
  • Tatsuo Narikiyo (1)
  • Michihiro Kawanishi (1)
  1. Department of Advanced Science and Technology, Toyota Technological Institute, Nagoya, Japan
