Multi-view 4D Reconstruction of Human Action for Entertainment Applications


Multi-view 4D reconstruction of human action has a number of applications in entertainment. This chapter describes a selection of application areas of interest to the broadcast, movie and gaming industries, in particular free-viewpoint video techniques for special effects and post-match sports analysis. The appearance of human action is captured as 4D data, that is, 3D volumetric or surface data over time. A review of recent approaches identifies two major classes: 4D reconstruction and model-based tracking. The second part of the chapter describes aspects of a practical implementation of a 4D reconstruction pipeline, including implementations of the popular visual hull, a building block in many free-viewpoint video systems.
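
As an illustration of the visual hull idea mentioned in the abstract, the following is a minimal sketch of voxel-based silhouette carving. It is not the chapter's implementation: the function name `carve_visual_hull`, the 3x4 projection matrices and the toy 8x8 silhouettes are all assumptions made for this example. A 3D point is kept only if it projects inside the foreground silhouette of every camera.

```python
import numpy as np

def carve_visual_hull(silhouettes, projections, points):
    """Return a boolean mask over `points` (N x 3) that is True where the
    point projects inside every silhouette (hypothetical helper)."""
    n = len(points)
    homog = np.hstack([points, np.ones((n, 1))])  # homogeneous coords, N x 4
    inside = np.ones(n, dtype=bool)
    for mask, P in zip(silhouettes, projections):
        uvw = homog @ P.T                          # projected points, N x 3
        w = uvw[:, 2]
        valid = w > 1e-9                           # in front of the camera
        u = np.full(n, -1, dtype=int)
        v = np.full(n, -1, dtype=int)
        u[valid] = np.round(uvw[valid, 0] / w[valid]).astype(int)
        v[valid] = np.round(uvw[valid, 1] / w[valid]).astype(int)
        height, width = mask.shape
        in_image = valid & (u >= 0) & (u < width) & (v >= 0) & (v < height)
        hit = np.zeros(n, dtype=bool)
        hit[in_image] = mask[v[in_image], u[in_image]]  # row = v, column = u
        inside &= hit                              # intersect over all views
    return inside

# Toy setup (assumed): an 8x8x8 voxel grid and two axis-aligned views.
axes = np.arange(8)
grid = np.stack(np.meshgrid(axes, axes, axes, indexing="ij"), axis=-1).reshape(-1, 3)

square = np.zeros((8, 8), dtype=bool)
square[2:6, 2:6] = True                            # foreground silhouette

# Affine stand-ins for calibrated cameras: one looks along z, one along x.
P_front = np.array([[1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 0, 1.0]])
P_side = np.array([[0, 0, 1.0, 0], [0, 1.0, 0, 0], [0, 0, 0, 1.0]])

occupied = carve_visual_hull([square, square], [P_front, P_side], grid)
print(int(occupied.sum()), "voxels remain after carving")
```

Running the same carving step independently on each time frame of a synchronized multi-camera sequence gives a volumetric 4D representation in the sense used above. A real system would use calibrated perspective cameras and segmented video frames in place of the toy matrices and masks.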


Keywords: Camera Calibration; Foreground Object; Camera Parameter; Stereo Match; Visual Hull



Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  1. BBC Research & Development, London, UK