Abstract
Automatic recovery of camera motion and scene structure from video sequences has been a staple of computer vision research for over a decade. The area has seen both steady and explosive progress, and now represents one of the success stories of computer vision. This task, automatic camera tracking or “matchmoving”, is the sine qua non of modern special effects, allowing the seamless insertion of computer-generated objects into live-action backgrounds (figure 2.1 shows an example). It has moved from a research problem involving a small number of uncalibrated images to commercial software that can automatically track cameras through thousands of frames [1]. In addition, camera tracking is an important preprocessing step for many computer vision algorithms, such as multiple-view shape reconstruction, novel view synthesis and autonomous vehicle navigation.
References
2d3 Ltd. http://www.2d3.com 2002.
S. Avidan and A. Shashua. Threading fundamental matrices. In Proceedings of the European Conference on Computer Vision, pages 124–140. Springer-Verlag, 1998.
P. A. Beardsley, P. H. S. Torr, and A. Zisserman. 3D model acquisition from extended image sequences. In Proceedings of the 4th European Conference on Computer Vision, LNCS 1065, Cambridge, pages 683–695, 1996.
M. J. Black and P. Anandan. A framework for the robust estimation of optical flow. In Proceedings of the 4th International Conference on Computer Vision, Berlin, pages 231–236, 1993.
D. Bondyfalat and S. Bougnoux. Imposing euclidean constraints during self-calibration processes. In R. Koch and L. Van Gool, editors, 3D Structure from Multiple Images of Large-Scale Environments. LNCS 1506. Springer-Verlag, 1998.
M. Brand. Morphable 3d models from video. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages II: 456–463, 2001.
D.C. Brown. Decentering distortion of lenses. Photogrammetric Engineering, 32(3):444–462, 1966.
A. Chiuso, P. Favaro, H. Jin, and S. Soatto. MfM: 3-D motion and structure causally integrated over time. In Proceedings of the 6th European Conference on Computer Vision, Dublin, Ireland, 2000.
J. C. Clarke and A. Zisserman. Detecting and tracking of independent motion. Image and Vision Computing, 14:565–572, 1996.
F. Devernay and O. D. Faugeras. Automatic calibration and removal of distortion from scenes of structured environments. In SPIE, volume 2567, San Diego, CA, July 1995.
O. Faugeras and Q.-T. Luong. The Geometry of Multiple Images. MIT Press, 2001.
A. W. Fitzgibbon. Simultaneous linear estimation of multiple view geometry and lens distortion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2001.
A. W. Fitzgibbon, G. Cross, and A. Zisserman. Automatic 3D model construction for turn-table sequences. In R. Koch and L. Van Gool, editors, 3D Structure from Multiple Images of Large-Scale Environments, LNCS 1506, pages 155–170. Springer-Verlag, June 1998.
A. W. Fitzgibbon and A. Zisserman. Automatic camera recovery for closed or open image sequences. In Proceedings of the European Conference on Computer Vision, pages 311–326. Springer-Verlag, June 1998.
A. W. Fitzgibbon and A. Zisserman. Multibody structure and motion: 3-D reconstruction of independently moving objects. In Proceedings of the European Conference on Computer Vision, pages 891–906. Springer-Verlag, June 2000.
S. Gibson, J. Cook, T. L. J. Howard, R. J. Hubbold, and D. Oram. Accurate camera calibration for off-line, video-based augmented reality. In IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR 2002), September 2002.
R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, ISBN: 0521623049, 2000.
D. Jacobs. Linear fitting with missing data: Applications to structure from motion and to characterizing intensity images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 206–212, 1997.
F. Kahl. Geometry and Critical Configurations of Multiple Views. PhD thesis, Lund Institute of Technology, 2001.
F. Kahl, R. I. Hartley, and K. Åström. Critical configurations for n-view projective reconstruction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Kauai, Hawaii, 2001.
R. Koch, M. Pollefeys, B. Heigl, L. Van Gool, and H. Niemann. Calibration of hand-held camera sequences for plenoptic modeling. In Proceedings of the 7th International Conference on Computer Vision, Kerkyra, Greece, pages 585–591, 1999.
S. Laveau. Geometry of a system of N cameras. Theory, estimation, and applications. PhD thesis, INRIA, 1996.
D. Martinec and T. Pajdla. Structure from many perspective images with occlusions. In Proceedings of the European Conference on Computer Vision, pages 355–369, 2002.
P. McLauchlan. A batch/recursive algorithm for 3D scene reconstruction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Hilton Head Island, South Carolina, pages II: 738–743, 2000.
P. McLauchlan, X. Shen, P. Palmer, A. Manessis, and A. Hilton. Surface-based structure-from-motion using feature groupings. In Proceedings of the Asian Conference on Computer Vision, 2000.
T. Moons, L. Van Gool, M. Van Diest, and A. Oosterlinck. Affine structure from perspective image pairs obtained by a translating camera. In J. L. Mundy, A. Zisserman, and D. Forsyth, editors, Applications of invariance in computer vision, pages 297–316. Springer-Verlag, 1994.
D. Nister. Reconstruction from uncalibrated sequences with a hierarchy of trifocal tensors. In Proceedings of the European Conference on Computer Vision, 2000.
D. Nister. Automatic Dense Reconstruction from Uncalibrated Video Sequences. PhD thesis, Dept. of Numerical Analysis and Computing Science, KTH Stockholm, 2001.
M. Pollefeys, F. Verbiest, and L. J. Van Gool. Surviving dominant planes in uncalibrated structure and motion recovery. In ECCV (2), pages 837–851, 2002.
H. S. Sawhney, S. Hsu, and R. Kumar. Robust video mosaicing through topology inference and local to global alignment. In Proceedings of the European Conference on Computer Vision, pages 103–119. Springer-Verlag, 1998.
R. A. Smith, A. W. Fitzgibbon, and A. Zisserman. Improving augmented reality using image and scene constraints. In Proceedings of the 10th British Machine Vision Conference, Nottingham, pages 295–304. BMVA Press, 1999.
G. P. Stein. Lens distortion calibration using point correspondences. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Puerto Rico, 1997.
P. Sturm. On focal length calibration from two views. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Kauai, Hawaii, 2001.
P. Sturm and W. Triggs. A factorization based algorithm for multi-image projective structure and motion. In Proceedings of the 4th European Conference on Computer Vision, Cambridge, UK, pages 709–720, 1996.
C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization approach. International Journal of Computer Vision, 9(2):137–154, November 1992.
B. Tordoff and D. W. Murray. Violating rotating camera geometry: The effect of radial distortion on self-calibration. In Proceedings of the International Conference on Pattern Recognition, 2000.
P. H. S. Torr, A. W. Fitzgibbon, and A. Zisserman. The problem of degeneracy in structure and motion recovery from uncalibrated image sequences. International Journal of Computer Vision, 32(1):27–44, August 1999.
P. H. S. Torr and D. W. Murray. Stochastic motion clustering. In Proceedings of the 3rd European Conference on Computer Vision, Stockholm, Sweden, pages 328–337. Springer-Verlag, 1994.
L. Torresani, D. Yang, G. Alexander, and C. Bregler. Tracking and modeling non-rigid objects with rank constraints. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2001.
W. Triggs, P. McLauchlan, R. Hartley, and A. Fitzgibbon. Bundle adjustment: A modern synthesis. In W. Triggs, A. Zisserman, and R. Szeliski, editors, Vision Algorithms: Theory and Practice, LNCS. Springer-Verlag, 2000.
J. Weber and J. Malik. Rigid body segmentation and shape description from dense optical flow under weak perspective. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(2):139–143, 1997.
L. Wolf and A. Shashua. On projection matrices ℙk → ℙ2, k = 3, ⋯, 6, and their applications in computer vision. In Proceedings of the International Conference on Computer Vision, 2001.
Z. Zhang. On the epipolar geometry between two images with lens distortion. In Proceedings of the International Conference on Pattern Recognition, pages 407–411, 1996.
Copyright information
© 2003 Springer Science+Business Media New York
About this chapter
Cite this chapter
Fitzgibbon, A.W., Zisserman, A. (2003). Automatic Camera Tracking. In: Shah, M., Kumar, R. (eds) Video Registration. The International Series in Video Computing, vol 5. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0459-7_2
DOI: https://doi.org/10.1007/978-1-4615-0459-7_2
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-5087-3
Online ISBN: 978-1-4615-0459-7
eBook Packages: Springer Book Archive