Distributed Vision Networks for Human Pose Analysis

  • Hamid Aghajan
  • Chen Wu
  • Richard Kleihorst

Multi-camera networks offer potential for a variety of novel human-centric applications by provisioning rich visual information. Local processing of the acquired video at the source camera enables scalable vision networks by avoiding the transfer of raw images. Distributed processing is further motivated by the need to preserve the privacy of network users while offering services in applications such as assisted living. Processing images at the source also offers flexibility in the type of features extracted and the level of data exchanged between cameras in a collaborative processing framework. In such a framework, data fusion can occur across three dimensions: 3D space (multiple views), time, and feature levels.
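The bandwidth argument above — each camera reduces its raw frames to a concise descriptor before communicating — can be sketched as follows. The descriptor fields (silhouette bounding box and area) and the wire format are illustrative assumptions; the chapter leaves the choice of features open.

```python
import struct

def extract_descriptor(frame, camera_id, timestamp):
    """Reduce a raw frame (2D list of 0/1 silhouette pixels) to a
    concise descriptor: bounding box and pixel count of the silhouette.
    (Illustrative feature choice, not the chapter's actual feature set.)"""
    ys = [y for y, row in enumerate(frame) for v in row if v]
    xs = [x for row in frame for x, v in enumerate(row) if v]
    if not xs:  # no foreground detected in this frame
        return None
    return {
        "camera": camera_id,
        "t": timestamp,
        "bbox": (min(xs), min(ys), max(xs), max(ys)),
        "area": len(xs),
    }

def pack_descriptor(d):
    """Serialize to an 18-byte message: camera id, timestamp, four
    bbox coordinates, area -- far smaller than the raw image."""
    return struct.pack("<HIhhhhI", d["camera"], d["t"], *d["bbox"], d["area"])

# A 6x8 toy "frame" containing a small 3x3 silhouette blob.
frame = [[0] * 8 for _ in range(6)]
for y in range(2, 5):
    for x in range(3, 6):
        frame[y][x] = 1

d = extract_descriptor(frame, camera_id=1, timestamp=42)
msg = pack_descriptor(d)
print(d["bbox"], len(msg))  # (3, 2, 5, 4) 18
```

Only `msg` would cross the network; a peer camera can unpack it with the same `struct` format string, so the exchange stays independent of image resolution.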

In this chapter, collaborative processing and data fusion mechanisms are examined in the context of a human pose estimation framework. For efficient collaboration between cameras under a low-bandwidth communication constraint, only concise descriptions of extracted features, rather than raw images, are communicated. A 3D human body model is employed as the convergence point of spatiotemporal and feature fusion. The model also serves as a bridge between the vision network and the high-level reasoning module, which can extract gestures and interpret them against the user's context and behavior models to arrive at system-level decisions. The body model further allows the cameras to interact with one another in initializing feature extraction parameters, or in evaluating the relative value of their derived features.
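One way to read the body model's role as a "convergence point": each camera contributes an observation of a model parameter (say, a joint angle) with some confidence, and the network fuses these across views and then across time. A minimal confidence-weighted sketch, with made-up observation values and a simple exponential smoother standing in for the chapter's fusion machinery:

```python
def fuse_observations(obs):
    """Confidence-weighted average of per-camera estimates of a single
    body-model parameter (e.g., a joint angle in degrees).
    obs: list of (value, confidence) pairs, one per camera view."""
    total_w = sum(w for _, w in obs)
    if total_w == 0:
        raise ValueError("no confident observations to fuse")
    return sum(v * w for v, w in obs) / total_w

def temporal_smooth(prev, current, alpha=0.7):
    """Blend the fused spatial estimate with the previous time step,
    giving the model parameter continuity over time."""
    return alpha * current + (1 - alpha) * prev

# Three cameras observe the same elbow angle; the third has a poor view,
# so it reports low confidence and barely influences the fused value.
spatial = fuse_observations([(88.0, 0.9), (92.0, 0.8), (70.0, 0.1)])
estimate = temporal_smooth(prev=90.0, current=spatial)
```

The confidence weights are also where cameras could "evaluate the relative value of their derived features": a view with heavy occlusion simply down-weights itself.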


Keywords: Particle Swarm Optimization · Assisted Living · Single Camera · Fall Detection · Alert Level





Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Hamid Aghajan (1)
  • Chen Wu (1)
  • Richard Kleihorst (2)
  1. Department of Electrical Engineering, Stanford University, Stanford, USA
  2. NXP Semiconductor Research, Netherlands
