Abstract
Estimating the position and orientation (pose) of objects in images is a crucial step toward successful robot programming by demonstration using visual task learning. A number of algorithms currently exist for detecting and tracking objects in images, ranging from conventional image processing methods to state-of-the-art methods based on deep learning. However, accurate estimation of the 6D poses of objects across a sequence of video frames still poses challenges. In this paper, we present a novel deep learning method for pose estimation based on data augmentation and nonlinear regression. For training purposes, thousands of images corresponding to views of different poses of an object are generated from a known CAD model of the object geometry. The trained deep neural network is employed for accurate, real-time estimation of the object's orientation. The object's position coordinates in the demonstrations are obtained from the depth information of the scene captured by a Microsoft Kinect v2.0 sensor. The resulting 6D poses are estimated at each time frame and are employed for learning robotic tasks at a trajectory level of abstraction. Robot inverse kinematics is applied to generate a program for robotic task execution. The proposed method is validated by transferring new skills to a robot in a painting application.
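The abstract states that object position coordinates are obtained from the depth information captured by a Kinect v2.0 sensor. A minimal sketch of how a depth pixel maps to a 3D position in the camera frame, using the standard pinhole back-projection model, is shown below. This is an illustration under stated assumptions, not the authors' implementation; the intrinsic parameters (`FX`, `FY`, `CX`, `CY`) are placeholder values, not the actual Kinect v2 calibration.

```python
import numpy as np

# Placeholder depth-camera intrinsics (focal lengths fx, fy in pixels;
# principal point cx, cy). Real values come from the sensor's calibration.
FX, FY = 365.0, 365.0
CX, CY = 256.0, 212.0

def backproject(u, v, depth_mm):
    """Back-project a depth pixel (u, v) with depth in millimetres to a
    3D point (x, y, z) in metres in the camera frame (pinhole model)."""
    z = depth_mm / 1000.0          # convert depth to metres
    x = (u - CX) * z / FX          # horizontal offset scaled by depth
    y = (v - CY) * z / FY          # vertical offset scaled by depth
    return np.array([x, y, z])

# A pixel at the principal point maps straight down the optical axis.
p = backproject(256.0, 212.0, 1500.0)
```

Combining this per-frame position with the orientation regressed by the deep network yields the 6D pose used for trajectory-level task learning.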
Acknowledgements
This work was supported by an NSERC Innovation to Idea (I2I) grant (I2I PJ 486866-15). We would like to thank Ms. Kaiqi Cheng for validating the experiments. The authors received a high-end NVIDIA Titan XP graphics processing unit (GPU), which was used for this research.
© 2019 Springer Nature Singapore Pte Ltd.
Cite this paper
Ghahramani, M., Vakanski, A., Janabi-Sharifi, F. (2019). 6D Object Pose Estimation for Robot Programming by Demonstration. In: Martínez-García, A., Bhattacharya, I., Otani, Y., Tutsch, R. (eds) Progress in Optomechatronic Technologies. Springer Proceedings in Physics, vol 233. Springer, Singapore. https://doi.org/10.1007/978-981-32-9632-9_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-32-9631-2
Online ISBN: 978-981-32-9632-9
eBook Packages: Physics and Astronomy; Physics and Astronomy (R0)