A Hierarchical System for Recognition, Tracking and Pose Estimation

  • Conference paper
Machine Learning for Multimodal Interaction (MLMI 2004)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 3361))

Abstract

This paper presents a new system for recognition, tracking and pose estimation of people in video sequences. It is based on a wavelet transform of the upper body and uses Support Vector Machines (SVMs) for classification. Recognition is carried out hierarchically, first recognizing people and then distinguishing individual characters. The characteristic features that best discriminate one person from another are learned automatically. Tracking is performed with a particle filter that combines the SVM output with a first-order kinematic model, yielding a robust scheme that handles occlusion, different poses and camera zooms. For pose estimation, a collection of SVM classifiers is evaluated to detect specific, learned poses.
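The tracking step described in the abstract can be illustrated with a minimal sketch of a particle filter driven by a first-order (constant-velocity) kinematic model. The abstract does not give the state layout, noise levels, or how SVM outputs are turned into particle weights, so everything below is an assumption: the function names are hypothetical, `toy_score` is a stand-in for a real SVM decision value, and the exponential weighting is one plausible choice rather than the authors' method.

```python
import numpy as np


def predict(particles, dt, pos_noise, vel_noise, rng):
    """First-order kinematic model; each particle is [x, y, vx, vy]."""
    out = particles.copy()
    out[:, :2] += dt * out[:, 2:]  # move each particle along its velocity
    out[:, :2] += rng.normal(0.0, pos_noise, size=(len(out), 2))
    out[:, 2:] += rng.normal(0.0, vel_noise, size=(len(out), 2))
    return out


def weights_from_scores(scores):
    """Turn classifier scores into normalized weights (assumed mapping)."""
    w = np.exp(scores - np.max(scores))  # subtract the max for stability
    return w / w.sum()


def resample(particles, weights, rng):
    """Multinomial resampling: draw particles in proportion to weight."""
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx]


def toy_score(particles, target, width=5.0):
    """Hypothetical stand-in for an SVM decision value.

    Scores are higher for particles close to the true position.
    A real system would score the image window at each particle.
    """
    d2 = np.sum((particles[:, :2] - target) ** 2, axis=1)
    return -d2 / (2.0 * width ** 2)


def track(targets, n_particles=500, seed=0):
    """Run predict / weight / resample over a sequence of positions."""
    rng = np.random.default_rng(seed)
    particles = np.zeros((n_particles, 4))
    particles[:, :2] = targets[0] + rng.normal(0.0, 5.0, size=(n_particles, 2))
    estimates = []
    for target in targets:
        particles = predict(particles, 1.0, 1.0, 0.5, rng)
        w = weights_from_scores(toy_score(particles, target))
        estimates.append(w @ particles[:, :2])  # weighted mean position
        particles = resample(particles, w, rng)
    return np.array(estimates)


# Synthetic target moving at constant velocity (2, 1) pixels per frame.
targets = np.array([[10.0 + 2.0 * t, 20.0 + 1.0 * t] for t in range(30)])
estimates = track(targets)
errors = np.linalg.norm(estimates - targets, axis=1)
```

In this sketch the velocity components are never observed directly. They are inferred because particles whose velocity matches the target's motion keep landing in high-scoring regions and survive resampling, which is what lets such a filter coast through short occlusions.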

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zehnder, P., Koller-Meier, E., Van Gool, L. (2005). A Hierarchical System for Recognition, Tracking and Pose Estimation. In: Bengio, S., Bourlard, H. (eds) Machine Learning for Multimodal Interaction. MLMI 2004. Lecture Notes in Computer Science, vol 3361. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30568-2_28

  • DOI: https://doi.org/10.1007/978-3-540-30568-2_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24509-4

  • Online ISBN: 978-3-540-30568-2

  • eBook Packages: Computer Science, Computer Science (R0)
