Robust video mosaicing through topology inference and local to global alignment
The problem of piecing together individual frames in a video sequence to create seamless panoramas (video mosaics) has attracted increasing attention in recent times. One challenge in this domain has been to rapidly and automatically create high quality seamless mosaics using inexpensive cameras and relatively free hand motions.
To capture a wide-angle scene with a video sequence of relatively narrow-angle views, the scene needs to be scanned in a 2D pattern. This is like painting a canvas on a 2D manifold with the video frames using multiple connected 1D brush strokes. An important issue in this context is how to align frames captured with a 2D scan of the scene rather than the 1D scan that many existing mosaicing systems assume.
In this paper we present an end-to-end solution to the problem of video mosaicing when the transformations between frames can be modeled parametrically. We provide solutions to two key problems: (i) automatic inference of the topology of the video frames on a 2D manifold, and (ii) globally consistent estimation of the alignment parameters that map each frame to a common mosaic coordinate system. Our method iterates among automatic topology determination, local alignment, and globally consistent parameter estimation to produce a coherent mosaic from a video sequence, regardless of the camera's scan path over the scene. While this framework is developed independently of the specific alignment model, we illustrate the approach by constructing planar and spherical mosaics from real videos.
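To make the two sub-problems concrete, the following is a minimal sketch under strong simplifying assumptions, not the method of the paper. It assumes a translation-only alignment model (the paper allows general parametric models), axis-aligned frame footprints, and pairwise offsets that have already been measured by some local registration step. The function names `infer_topology` and `global_align`, the `min_overlap` threshold, and the Gauss-Seidel solver are all choices made for this illustration.

```python
from itertools import combinations


def infer_topology(positions, frame_size, min_overlap=0.25):
    """Return frame pairs (i, j), i < j, whose footprints overlap enough.

    positions   : list of (x, y) frame origins in mosaic coordinates
    frame_size  : (width, height) of every frame (assumed identical)
    min_overlap : required overlap as a fraction of one frame's area
                  (hypothetical threshold chosen for this sketch)
    """
    w, h = frame_size
    edges = []
    for i, j in combinations(range(len(positions)), 2):
        dx = abs(positions[i][0] - positions[j][0])
        dy = abs(positions[i][1] - positions[j][1])
        # Overlap area of two axis-aligned rectangles of equal size.
        overlap = max(0.0, w - dx) * max(0.0, h - dy)
        if overlap >= min_overlap * w * h:
            edges.append((i, j))
    return edges


def global_align(n_frames, offsets, sweeps=200):
    """Least-squares frame positions from pairwise offsets.

    offsets maps (i, j) to the measured displacement d_ij with
    p_j ~ p_i + d_ij. Frame 0 is pinned at the origin, and the sum of
    |p_j - p_i - d_ij|^2 is minimised by Gauss-Seidel sweeps: each free
    frame is moved to the mean of the positions its neighbours predict.
    """
    neighbours = {k: [] for k in range(n_frames)}
    for (i, j), (dx, dy) in offsets.items():
        neighbours[j].append((i, dx, dy))    # p_j is predicted as p_i + d_ij
        neighbours[i].append((j, -dx, -dy))  # p_i is predicted as p_j - d_ij
    pos = [[0.0, 0.0] for _ in range(n_frames)]
    for _ in range(sweeps):
        for k in range(1, n_frames):  # frame 0 stays fixed
            links = neighbours[k]
            if not links:
                continue
            pos[k][0] = sum(pos[m][0] + dx for m, dx, _ in links) / len(links)
            pos[k][1] = sum(pos[m][1] + dy for m, _, dy in links) / len(links)
    return [tuple(p) for p in pos]


# Four frames scanned in a closed 2D loop; frames 0 and 3 are not
# consecutive in time but are spatial neighbours.
rough = [(0, 0), (10, 0), (10, 10), (0, 10)]
edges = infer_topology(rough, frame_size=(16, 16))
measured = {(0, 1): (10, 0), (1, 2): (0, 10), (2, 3): (-10, 0), (0, 3): (0, 10)}
aligned = global_align(4, measured)
```

In a full pipeline these two steps would alternate with local registration: the refined positions feed a new topology pass, newly found neighbour pairs are registered, and the global estimate is updated until the set of neighbour pairs stops changing.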
Keywords: Video Sequence, Lens Distortion, Alignment Parameter, Neighboring Frame, Local Registration