Estimating the Lecturer’s Head Pose in Seminar Scenarios – A Multi-view Approach

Voit, Michael; Nickel, Kai; Stiefelhagen, Rainer

doi:10.1007/11677482_20

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3869)

Abstract

In this paper, we present a system that tracks the horizontal head orientation of a lecturer in a smart seminar room equipped with several cameras. We automatically detect and track the lecturer’s face and use neural networks to classify his or her face orientation in each camera view. By combining the single-view estimates of the speaker’s head orientation from multiple cameras into one joint hypothesis, we improve overall head pose estimation accuracy. We conducted experiments on annotated recordings from real seminars. Using the proposed fully automatic system, we correctly determine the lecturer’s head pose 59% of the time for 8 orientation classes. In 92% of the time, either the correct pose class or a neighbouring pose class (i.e. an error of at most 45 degrees) was estimated.
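
The abstract does not state how the per-camera estimates are merged or how the two accuracy figures are computed, so the following is only a minimal sketch under stated assumptions. It assumes each camera's classifier outputs a posterior over the 8 orientation classes relative to that camera, that each camera's viewing direction is known as an integer class offset in the room frame, and that fusion is a plain sum of the aligned posteriors. The names `fuse`, `to_room_frame`, `circular_class_error` and `score` are hypothetical and not taken from the paper.

```python
import numpy as np

N_CLASSES = 8  # 8 horizontal orientation classes, 45 degrees apart


def to_room_frame(cam_posterior, cam_offset):
    """Shift a camera-relative posterior into the shared room frame.

    cam_offset is the camera's orientation expressed in class steps
    (assumed known from calibration; sign convention is an assumption).
    """
    return np.roll(np.asarray(cam_posterior, dtype=float), cam_offset)


def fuse(posteriors, offsets):
    """Sum the aligned per-camera posteriors and renormalise.

    Summation is an assumed fusion rule, chosen for simplicity.
    """
    total = np.zeros(N_CLASSES)
    for posterior, offset in zip(posteriors, offsets):
        total += to_room_frame(posterior, offset)
    return total / total.sum()


def circular_class_error(pred, truth):
    """Distance between two classes on the 8-class circle (0 to 4)."""
    d = abs(pred - truth) % N_CLASSES
    return min(d, N_CLASSES - d)


def score(preds, truths):
    """Return (exact rate, exact-or-neighbour rate) over a sequence."""
    errors = np.array([circular_class_error(p, t)
                       for p, t in zip(preds, truths)])
    return float(np.mean(errors == 0)), float(np.mean(errors <= 1))


# Two cameras looking at the same head from views 90 degrees apart.
cam_a = [0.05, 0.20, 0.50, 0.20, 0.05, 0.0, 0.0, 0.0]  # peak at class 2
cam_b = [0.50, 0.20, 0.05, 0.0, 0.0, 0.0, 0.05, 0.20]  # peak at class 0
joint = fuse([cam_a, cam_b], offsets=[0, 2])
room_class = int(np.argmax(joint))
```

In this toy setup both cameras agree once aligned, so the joint hypothesis peaks at the same room-frame class. The second value returned by `score` corresponds to the "correct or neighbouring class" measure, where classes 0 and 7 count as neighbours because the orientation classes wrap around.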

Author information

Authors and Affiliations

Interactive Systems Lab, Universität Karlsruhe (TH), Germany
Michael Voit, Kai Nickel & Rainer Stiefelhagen

Editor information

Editors and Affiliations

University of Edinburgh, Edinburgh, Scotland
Steve Renals
IDIAP Research Institute, Martigny, Switzerland
Samy Bengio

About this paper

Cite this paper

Voit, M., Nickel, K., Stiefelhagen, R. (2006). Estimating the Lecturer’s Head Pose in Seminar Scenarios – A Multi-view Approach. In: Renals, S., Bengio, S. (eds) Machine Learning for Multimodal Interaction. MLMI 2005. Lecture Notes in Computer Science, vol 3869. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11677482_20

DOI: https://doi.org/10.1007/11677482_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32549-9
Online ISBN: 978-3-540-32550-5
eBook Packages: Computer Science, Computer Science (R0)
