Applications to Safe Human–Robot Interaction

  • Chapter
3D Computer Vision

Part of the book series: X.media.publishing (XMEDIAPUBL)

Abstract

In this chapter we address safe human–robot interaction in the industrial production environment. In car manufacturing, for example, industrial production processes consist either of fully automatic sequences carried out solely by industrial robots or of fully manual assembly steps in which only humans work together on the same task. Close collaboration between humans and industrial robots is very limited and usually not possible because of safety concerns. Industrial production processes may become more efficient if humans and machines collaborate closely and each contributes its unique capabilities, which requires sophisticated techniques for human–robot interaction. In this context, recognising interactions between humans and industrial robots requires vision methods for three-dimensional pose estimation and for tracking the motion of human body parts, based on three-dimensional scene analysis. We begin with an overview of gesture recognition methods in the general context of human–robot interaction, followed by an overview of vision-based safe human–robot interaction. We then evaluate, in a typical industrial production environment, the three-dimensional approach to the detection and tracking of objects in point clouds described in Chap. 1. The methods introduced for three-dimensional detection, pose estimation, and tracking of human body parts, and for the recognition of their actions, are evaluated in similar scenarios.

Notes

  1. The image sequences and ground truth data are accessible at http://aiweb.techfak.uni-bielefeld.de/content/hand-forearm-limb-data-set.

Copyright information

© 2013 Springer-Verlag London

About this chapter

Cite this chapter

Wöhler, C. (2013). Applications to Safe Human–Robot Interaction. In: 3D Computer Vision. X.media.publishing. Springer, London. https://doi.org/10.1007/978-1-4471-4150-1_7

  • DOI: https://doi.org/10.1007/978-1-4471-4150-1_7

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-4149-5

  • Online ISBN: 978-1-4471-4150-1

  • eBook Packages: Computer Science, Computer Science (R0)
