Large-Scale Image Geolocalization

Chapter in Multimodal Location Estimation of Videos and Images

Abstract

In this chapter, we explore the task of global image geolocalization: estimating where on Earth a photograph was captured. We examine variants of the “im2gps” algorithm using millions of “geotagged” Internet photographs as training data. We first discuss a simple-to-understand nearest-neighbor baseline. Next, we introduce a lazy-learning approach with more sophisticated features that doubles the performance of the original “im2gps” algorithm. Beyond quantifying geolocalization accuracy, we also analyze (a) how the nonuniform distribution of training data impacts the algorithm, (b) how performance compares to baselines such as random guessing and land-cover recognition, and (c) whether geolocalization is simply landmark or “instance-level” recognition at a large scale. We also show that geolocation estimates can provide the basis for image understanding tasks such as population density estimation or land-cover estimation. This work was originally described, in part, in “im2gps” [9], which was the first attempt at global geolocalization using Internet-derived training data.
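
To make the nearest-neighbor baseline concrete, here is a minimal sketch of the general idea: the query photo inherits the GPS coordinates of the database photo closest to it in feature space. The three-dimensional feature vectors, the toy database, and the function names are assumptions for illustration only. The chapter's actual features and matching procedure are not reproduced here.

```python
import math

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_neighbor_geolocate(query_features, database):
    """Return the (lat, lon) of the database photo nearest to the query in feature space.

    `database` is a list of (features, (lat, lon)) pairs; this layout is a
    hypothetical choice for the sketch, not the chapter's data format.
    """
    _, location = min(database, key=lambda item: squared_distance(query_features, item[0]))
    return location

def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))

# Toy geotagged database with made-up features (illustrative values only).
database = [
    ([0.9, 0.1, 0.0], (48.8566, 2.3522)),     # Paris
    ([0.1, 0.8, 0.1], (35.6762, 139.6503)),   # Tokyo
    ([0.0, 0.2, 0.8], (-33.8688, 151.2093)),  # Sydney
]

guess = nearest_neighbor_geolocate([0.85, 0.15, 0.0], database)
error_km = haversine_km(guess, (48.85, 2.35))  # distance from a hypothetical true location
```

With real data, the linear scan would likely be replaced by an approximate nearest-neighbor index, and the geolocation error would be compared against a distance threshold to score a guess as correct.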

Notes

  1. This value was calculated by counting the number of database photos close enough to each query in the test set. Alternatively, each geolocation guess has an area of 126,663 km² and the land area of the Earth is 148,940,000 km², suggesting that a truly uniform test set would have a chance guessing accuracy of 0.084%. Chance is higher for our test set because our database (and thus our test set) contains no photographs in some regions of Siberia, the Sahara, and Antarctica.
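
The uniform-chance figure in this note is a plain area ratio, which the sketch below reproduces as a sanity check. The two areas are taken from the note. The 200 km radius is an assumption that is not stated in this excerpt; it is included because a circle of that radius has an area close to the quoted one. The function name is hypothetical.

```python
import math

EARTH_LAND_AREA_KM2 = 148_940_000  # land area quoted in the note

def chance_accuracy_percent(guess_area_km2, land_area_km2=EARTH_LAND_AREA_KM2):
    """Percentage of land area covered by a single guess region."""
    return 100.0 * guess_area_km2 / land_area_km2

# Ratio using the guess area exactly as printed in the note.
printed_area_percent = chance_accuracy_percent(126_663)

# Ratio using a circle of radius 200 km (assumed radius, not from the note).
circle_area_km2 = math.pi * 200.0 ** 2
circle_area_percent = chance_accuracy_percent(circle_area_km2)

print(f"printed area: {printed_area_percent:.4f} %")
print(f"200 km circle ({circle_area_km2:,.0f} km^2): {circle_area_percent:.4f} %")
```

The printed area gives roughly 0.085 %, while the assumed 200 km circle (about 125,664 km²) gives roughly 0.084 %, so the two figures agree to within rounding of the area.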

References

  1. G. Baatz, O. Saurer, K. Köser, M. Pollefeys, Large scale visual geo-localization of images in mountainous terrain, in Proceedings of the 12th European Conference on Computer Vision - Volume Part II (2012), pp. 517–530

  2. M. Bar, The proactive brain: using analogies and associations to generate predictions. Trends Cogn. Sci. 11(7), 280–289 (2007)

  3. C. Atkeson, A. Moore, S. Schaal, Locally weighted learning. Artif. Intell. Rev. 11, 11–73 (1997)

  4. O. Chum, J. Philbin, J. Sivic, M. Isard, A. Zisserman, Total recall: Automatic query expansion with a generative feature model for object retrieval, in Proceedings of ICCV, 2007

  5. D. Comaniciu, P. Meer, Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 24(5), 603–619 (2002)

  6. D.J. Crandall, L. Backstrom, D. Huttenlocher, J. Kleinberg. Mapping the world’s photos, in WWW ’09: Proceedings of the 18th international conference on World wide web 2009, pp. 761–770, 2009

  7. J. Hays, A. Efros. Where in the world? human and computer geolocation of images, in Vision sciences society meeting, 2009

  8. J. Hays, A.A. Efros. Scene completion using millions of photographs, in ACM Transactions on Graphics (SIGGRAPH 2007), 26(3), 2007

  9. J. Hays, A.A. Efros. im2gps: estimating geographic information from a single image, in CVPR, 2008

  10. D. Hoiem, A. Efros, M. Hebert, Recovering surface layout from an image. Int. J. Comput. Vision. 75(1), 151–172 (2007)

  11. N. Jacobs, S. Satkin, N. Roman, R. Speyer, R. Pless, Geolocating static cameras, in Proceedings, ICCV, 2007

  12. E. Kalogerakis, O. Vesselova, J. Hays, A.A. Efros, A. Hertzmann. Image sequence geolocation with human travel priors, in Proceedings of the IEEE International Conference on Computer Vision (ICCV ’09) (2009)

  13. J. Kosecka, W. Zhang. Video compass, in ECCV ’02: Proceedings of the 7th European Conference on Computer Vision-Part IV, 2002, pp. 476–490

  14. J.-F. Lalonde, D. Hoiem, A.A. Efros, C. Rother, J. Winn, A. Criminisi. Photo clip art. ACM Transactions on Graphics (SIGGRAPH 2007), vol. 26(3) (August 2007)

  15. S. Lazebnik, C. Schmid, J. Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories, in CVPR (2006)

  16. L.-J. Li, L. Fei-Fei, What, where and who? classifying events by scene and object recognition, in Proceedings, ICCV (2007)

  17. T.-Y. Lin, S. Belongie, J. Hays. Cross-view image geolocalization, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (Portland, OR, June 2013)

  18. D. Lowe, Object recognition from local scale-invariant features. ICCV 2, 1150–1157 (1999)

  19. J. Luo, D. Joshi, J. Yu, A. Gallagher, Geotagging in multimedia and computer vision: a survey. Multimed. Tools Appl. 51, 187–211 (2011)

  20. D. Martin, C. Fowlkes, D. Tal, J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics, in Proceedings ICCV (July 2001)

  21. J. Matas, O. Chum, M. Urban, T. Pajdla, Robust wide-baseline stereo from maximally stable extremal regions. Image Vis. Comput. 22(10), 761–767 (2004)

  22. A. Oliva, A. Torralba, Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vision 42(3), 145–175 (2001)

  23. A. Oliva, A. Torralba. Building the gist of a scene: The role of global image features in recognition, in Visual Perception, Progress in Brain Research, 2006, vol. 155

  24. J. Philbin, O. Chum, M. Isard, J. Sivic, A. Zisserman. Object retrieval with large vocabularies and fast spatial matching, in CVPR (2007)

  25. J. Philbin, O. Chum, M. Isard, J. Sivic, A. Zisserman. Lost in quantization: Improving particular object retrieval in large scale image databases, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2008)

  26. T. Quack, B. Leibe, L. Van Gool. World-scale mining of objects and events from community photo collections, in CIVR ’08: Proceedings of the 2008 international conference on Content-based image and video retrieval (2008)

  27. L.W. Renninger, J. Malik, When is scene recognition just texture recognition? Vis. Res. 44, 2301–2311 (2004)

  28. I. Simon, N. Snavely, S.M. Seitz. Scene summarization for online image collections, in Proceedings, ICCV (2007)

  29. J. Sivic, A. Zisserman, Video Google: A text retrieval approach to object matching in videos. ICCV 2, 1470–1477 (2003)

  30. N. Snavely, S.M. Seitz, R. Szeliski, Photo tourism: exploring photo collections in 3d. ACM Trans. Graph. 25(3), 835–846 (2006)

  31. R. Szeliski. “Where am I?”: ICCV 2005 Computer Vision Contest. http://research.microsoft.com/iccv2005/Contest/

  32. W. Thompson, C. Valiquette, B. Bennett, K. Sutherland, Geometric reasoning for map-based localization. Spatial Cogn. Comput 1(3), 291–321 (1999)

  33. A. Torralba, R. Fergus, W.T. Freeman, 80 million tiny images: a large dataset for non-parametric object and scene recognition. IEEE PAMI 30(11), 1958–1970 (2008)

  34. J. Vogel, B. Schiele, Semantic modeling of natural scenes for content-based image retrieval. Int. J. Comput. Vis. 72(2), 133–157 (2007)

  35. J. Xiao, J. Hays, K. Ehinger, A. Oliva, A. Torralba. Sun database: Large-scale scene recognition from abbey to zoo, in CVPR (2010)

  36. H. Zhang, A.C. Berg, M. Maire, J. Malik. Svm-knn: Discriminative nearest neighbor classification for visual category recognition, in CVPR ’06 (2006)

  37. W. Zhang, J. Kosecka. Image based localization in urban environments, in 3DPVT ’06 (2006)

  38. Y. Zheng, M. Zhao, Y. Song, H. Adam, U. Buddemeier, A. Bissacco, F. Brucher, T.-S. Chua, H. Neven. Tour the world: building a web-scale landmark recognition engine, in CVPR (2009)


Acknowledgments

We thank Steve Schlosser, Julio Lopez, and Intel Research Pittsburgh for helping us overcome the logistical and computational challenges of this project. All visualizations and geographic data sources are derived from NASA data. Funding for this work was provided by an NSF fellowship to James Hays and NSF grants CAREER 1149853, CAREER 0546547, and CCF-0541230.

Corresponding author

Correspondence to James Hays.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Hays, J., Efros, A.A. (2015). Large-Scale Image Geolocalization. In: Choi, J., Friedland, G. (eds) Multimodal Location Estimation of Videos and Images. Springer, Cham. https://doi.org/10.1007/978-3-319-09861-6_3

  • DOI: https://doi.org/10.1007/978-3-319-09861-6_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09860-9

  • Online ISBN: 978-3-319-09861-6

  • eBook Packages: Engineering, Engineering (R0)
