Autonomous Robots, Volume 40, Issue 1, pp 175–192

Construction of a 3D object recognition and manipulation database from grasp demonstrations

  • David Kent
  • Morteza Behrooz
  • Sonia Chernova


Abstract

Object recognition and manipulation are critical for enabling robots to operate in household environments. Many grasp planners can estimate grasps based on object shape, but they ignore key information about non-visual object characteristics. Object model databases can account for this information, but existing methods for database construction are time and resource intensive. We present an easy-to-use system for constructing object models for 3D object recognition and manipulation made possible by advances in web robotics. The database consists of point clouds generated using a novel iterative point cloud registration algorithm. The system requires no additional equipment beyond the robot, and non-expert users can demonstrate grasps through an intuitive web interface. We validate the system with data collected from both a crowdsourcing user study and expert demonstration. We show that the demonstration approach outperforms purely vision-based grasp planning approaches for a wide variety of object classes.
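The paper's novel iterative registration algorithm is not reproduced on this page; as a generic illustration of the kind of iterative point cloud registration the abstract refers to, the sketch below implements a minimal point-to-point ICP (iterative closest point) loop with NumPy. The function name `icp` and all parameters are illustrative assumptions, not the authors' method.

```python
import numpy as np

def icp(source, target, iterations=20):
    """Minimal point-to-point ICP sketch (illustrative, not the paper's
    algorithm): iteratively aligns `source` (N x 3) to `target` (M x 3)
    using brute-force nearest-neighbor correspondences and the
    Kabsch/SVD rigid-transform solution. Returns the aligned source."""
    src = source.copy()
    for _ in range(iterations):
        # 1. Correspondences: nearest target point for each source point.
        dists = np.linalg.norm(src[:, None, :] - target[None, :, :], axis=2)
        matched = target[dists.argmin(axis=1)]
        # 2. Best rigid transform between the matched sets (Kabsch).
        mu_s, mu_t = src.mean(axis=0), matched.mean(axis=0)
        H = (src - mu_s).T @ (matched - mu_t)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:   # correct an improper (reflective) rotation
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_t - R @ mu_s
        # 3. Apply the estimated transform and repeat.
        src = src @ R.T + t
    return src
```

A production system would use a k-d tree for the nearest-neighbor search and an outlier-rejection step; both are omitted here for brevity.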


Keywords: Object manipulation · Grasp demonstration · Crowdsourcing



Acknowledgments

I would like to thank Professor Odest Chadwicke Jenkins of Brown University for allowing the use of the PR2 robot throughout the user study and validation experiments. Furthermore, I would like to thank fellow graduate students Russell Toris, for the development of the RMS, and Morteza Behrooz, for his help in conducting the user study. This work was supported by National Science Foundation Award Number 1149876, CAREER: Towards Robots that Learn from Everyday People (PI Sonia Chernova), and Office of Naval Research Grant N00014-08-1-0910, PECASE: Tracking Human Movement Using Models of Physical Dynamics and Neurobiomechanics with Probabilistic Inference (PI Odest Chadwicke Jenkins).

Supplementary material

Supplementary material 1 (mp4 118021 KB)



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Department of Robotics Engineering, Worcester Polytechnic Institute, Worcester, USA
  2. Department of Computer Science, Worcester Polytechnic Institute, Worcester, USA
