Abstract
To carry out manipulation tasks effectively, a robot must precisely and robustly recognize which objects are graspable and which category each graspable object belongs to, especially when training data is limited. In this paper, we propose a novel multi-loss hierarchical representation learning framework that recognizes the category of graspable objects in a coarse-to-fine manner. Our model consists of two main components: an efficient hierarchical feature learning component that combines kernel features with deep learning features, and a multi-loss function that optimizes the multi-task learning mechanism in a coarse-to-fine manner. We evaluate the proposed system on data of graspable and ungraspable objects. The results show that our system outperforms many existing algorithms in both classification accuracy and computational efficiency. Moreover, our system achieves an accuracy of about 82% in unstructured real-world conditions.
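The abstract does not give the form of the multi-loss function, so the following is only a minimal sketch of one way a coarse-to-fine objective over fused features could look. It assumes a coarse graspable/ungraspable cross-entropy over all samples plus a fine category cross-entropy over graspable samples only. The names `fuse`, `multi_loss`, and `fine_weight` are hypothetical and do not come from the paper.

```python
import math

def cross_entropy(logits, label):
    """Cross-entropy of one integer label under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]

def fuse(kernel_feats, deep_feats):
    """Concatenate per-sample kernel and deep feature vectors (assumed fusion)."""
    return [k + d for k, d in zip(kernel_feats, deep_feats)]

def multi_loss(coarse_logits, fine_logits, graspable, category, fine_weight=1.0):
    """Coarse loss over all samples plus a weighted fine loss over graspable ones.

    graspable: 0/1 label per sample (coarse task).
    category:  integer category per sample (fine task, ignored if ungraspable).
    """
    n = len(graspable)
    coarse = sum(cross_entropy(c, g) for c, g in zip(coarse_logits, graspable)) / n
    fine_terms = [
        cross_entropy(f, k)
        for f, g, k in zip(fine_logits, graspable, category)
        if g == 1  # the fine task is only defined for graspable objects
    ]
    fine = sum(fine_terms) / len(fine_terms) if fine_terms else 0.0
    return coarse + fine_weight * fine

# Toy usage: two samples, two coarse classes, three fine categories.
features = fuse([[0.1, 0.2], [0.3, 0.4]], [[0.5], [0.6]])
loss = multi_loss(
    coarse_logits=[[0.0, 1.0], [1.0, 0.0]],
    fine_logits=[[0.2, 0.1, 0.7], [0.0, 0.0, 0.0]],
    graspable=[1, 0],
    category=[2, 0],
)
```

Masking the fine term by the coarse label is one design choice among several; a weighted sum over all samples or a staged training schedule would be equally consistent with the abstract's description.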
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Wang, Z., Li, Z., Wang, B., Liu, H. (2017). Graspable Object Classification with Multi-loss Hierarchical Representations. In: Huang, Y., Wu, H., Liu, H., Yin, Z. (eds) Intelligent Robotics and Applications. ICIRA 2017. Lecture Notes in Computer Science, vol 10464. Springer, Cham. https://doi.org/10.1007/978-3-319-65298-6_42
DOI: https://doi.org/10.1007/978-3-319-65298-6_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65297-9
Online ISBN: 978-3-319-65298-6
eBook Packages: Computer Science (R0)