3D Reconstruction and Video-Based Rendering of Casually Captured Videos

Taneja, Aparna; Ballan, Luca; Puwein, Jens; Brostow, Gabriel J.; Pollefeys, Marc

doi:10.1007/978-3-642-24870-2_4

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7082)

Abstract

In this chapter we explore the possibility of interactively navigating a collection of casually captured videos of a performance: real-world footage captured on handheld cameras by a few members of the audience. The aim is to navigate the video collection in 3D by generating video-based renderings of the performance from a reconstruction of the event that is precomputed offline.

We propose two different techniques to obtain this reconstruction, considering that the video collection may have been recorded in complex, uncontrolled outdoor environments. One approach recovers the event geometry by exploring the temporal domain of each video independently, while the other explores the spatial domain of the video collection independently at each time instant. The pros and cons of the two methods, and their applicability to the navigation problem addressed here, are also discussed. Finally, we propose an interactive GPU-accelerated viewing tool to navigate the video collection.


Author information

Authors and Affiliations

Computer Vision and Geometry Group, ETH Zurich, Switzerland
Aparna Taneja, Luca Ballan, Jens Puwein & Marc Pollefeys
Department of Computer Science, University College London, UK
Gabriel J. Brostow

Editor information

Editors and Affiliations

Department of Computer Science, TU München, Germany
Daniel Cremers
Computer Graphics Lab, Mühlenpfordstrasse 23, 38106, Braunschweig, Germany
Marcus Magnor
Technische Universität München, Germany
Martin R. Oswald
Technion, Haifa, Israel
Lihi Zelnik-Manor

About this chapter

Cite this chapter

Taneja, A., Ballan, L., Puwein, J., Brostow, G.J., Pollefeys, M. (2011). 3D Reconstruction and Video-Based Rendering of Casually Captured Videos. In: Cremers, D., Magnor, M., Oswald, M.R., Zelnik-Manor, L. (eds) Video Processing and Computational Video. Lecture Notes in Computer Science, vol 7082. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24870-2_4

DOI: https://doi.org/10.1007/978-3-642-24870-2_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24869-6
Online ISBN: 978-3-642-24870-2
eBook Packages: Computer Science, Computer Science (R0)