Abstract
Video can provide an inexpensive source of information about the world. For many applications, such as surveillance, situation awareness and navigation, the utility of video acquired from an airborne platform is increased if we are able to assign precise geocoordinates to its pixels. Many video-capture platforms have physical sensors which can give an approximate relationship between the video and the world. For example, unmanned aerial vehicles (UAVs) can transmit to a ground control station, together with the video, telemetry describing the location and attitude of the platform relative to a world coordinate system, as well as the focal length and pose of the camera. While this telemetry, or Engineering Support Data (ESD), is very useful in giving a preliminary alignment of the video to the world, it may have limited accuracy due to a variety of factors: very precise inertial navigation systems (INS) are generally expensive; the weight of the inertial measurement unit (IMU), which for some implementations prescribes the accuracy, can be limited by the maximum payload of the platform; and a small jitter in the cameras can translate to significant ground errors for oblique or high-altitude platforms. Video processing has the potential to improve the precision of the alignment beyond what can be obtained with physical sensors alone.
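To illustrate why a small camera jitter matters for oblique or high-altitude platforms, the following is a minimal sketch under a simplifying assumption that is not taken from the chapter: flat terrain and a single pointing angle measured from nadir. Under that assumption a ray leaving a camera at altitude h with off-nadir angle θ meets the ground at horizontal range h·tan θ, so a pointing error δ shifts the ground point by roughly h·δ/cos²θ. The function names below are hypothetical and serve only this illustration.

```python
import math


def ground_error_m(altitude_m, off_nadir_deg, jitter_deg):
    """Ground shift (metres) caused by a pointing error.

    Assumes flat terrain: a ray at off-nadir angle theta hits the
    ground at horizontal range altitude * tan(theta).
    """
    theta = math.radians(off_nadir_deg)
    delta = math.radians(jitter_deg)
    if not (0.0 <= theta < math.pi / 2 and 0.0 <= theta + delta < math.pi / 2):
        raise ValueError("ray must intersect the ground ahead of the platform")
    return altitude_m * (math.tan(theta + delta) - math.tan(theta))


def ground_error_linear_m(altitude_m, off_nadir_deg, jitter_deg):
    """First-order approximation: altitude * delta / cos(theta)**2."""
    theta = math.radians(off_nadir_deg)
    delta = math.radians(jitter_deg)
    return altitude_m * delta / math.cos(theta) ** 2


if __name__ == "__main__":
    # Same 0.1 degree jitter, increasingly oblique views from 1000 m.
    for angle in (0, 45, 70):
        err = ground_error_m(1000.0, angle, 0.1)
        print(f"off-nadir {angle:2d} deg: ground error {err:.2f} m")
```

In this model the error grows linearly with altitude and as 1/cos²θ with obliquity, which is consistent with the abstract's remark that jitter is most damaging for oblique or high-altitude platforms.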
This research was supported by, and data was provided through, the U.S. Naval Air Systems Command under contract N00019-99-C-1385 and by DARPA contract DAAB07-98-C-J0243.
© 2003 Springer Science+Business Media New York
About this chapter
Cite this chapter
Matei, B. et al. (2003). Robust Video Georegistration in the Presence of Significant Appearance Changes. In: Shah, M., Kumar, R. (eds) Video Registration. The International Series in Video Computing, vol 5. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0459-7_8
DOI: https://doi.org/10.1007/978-1-4615-0459-7_8
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-5087-3
Online ISBN: 978-1-4615-0459-7
eBook Packages: Springer Book Archive