Where the Photos Were Taken: Location Prediction by Learning from Flickr Photos

Li, Li-Jia; Jha, Rahul Kumar; Thomee, Bart; Shamma, David Ayman; Cao, Liangliang; Wang, Yang

doi:10.1007/978-3-319-25781-5_3

Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

In this chapter, we explore the characteristics of geographically tagged Internet photos and estimate where each photo was taken from its visual content. We develop a principled machine learning model that estimates the geographical location of a photo by modeling the relationship between location and photo content. To build reliable geographical estimators, it is important to find distinguishable geographical clusters in the world. These clusters cover general geographical regions and are not limited to landmarks. Geographical clusters provide more training samples and hence lead to better recognition accuracy. We develop a framework for geographical cluster estimation that employs latent variables to represent the clusters, and we build an efficient solver to find these latent clusters. We present detailed qualitative results on beach photos taken on different continents. In addition, we show significantly improved quantitative results over other approaches for recognizing different beaches, using the Flickr beach dataset for validation.
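
The idea of treating geographical clusters as latent variables can be illustrated with a small sketch. The code below is not the chapter's model or solver, which this page does not describe; it is a simplified stand-in that assumes a k-means-style alternation between assigning photos to clusters and re-estimating each cluster's geographic and visual centroids. The function names, the weight `lam`, the seeding rule, and the use of plain Euclidean distance on latitude and longitude are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def fit_latent_clusters(geo, vis, k, lam=1.0, iters=20):
    """Alternate between latent assignments and cluster statistics.

    geo: list of (lat, lon) tuples; vis: list of visual feature tuples.
    Returns (assignments, geo_centroids, vis_centroids).
    Assumption: lat/lon are compared with plain Euclidean distance,
    which ignores the curvature of the Earth.
    """
    # Hypothetical initialisation: seed the k clusters with the first k photos.
    geo_c = [geo[i] for i in range(k)]
    vis_c = [vis[i] for i in range(k)]
    z = [0] * len(geo)
    for _ in range(iters):
        # Step 1: assign each photo to the cluster with the lowest joint cost
        # (squared geographic distance plus weighted squared visual distance).
        new_z = [
            min(
                range(k),
                key=lambda j: dist(g, geo_c[j]) ** 2 + lam * dist(v, vis_c[j]) ** 2,
            )
            for g, v in zip(geo, vis)
        ]
        # Step 2: re-estimate centroids; an empty cluster keeps its old ones.
        for j in range(k):
            members = [i for i, zi in enumerate(new_z) if zi == j]
            if members:
                geo_c[j] = mean([geo[i] for i in members])
                vis_c[j] = mean([vis[i] for i in members])
        if new_z == z:  # assignments stopped changing
            break
        z = new_z
    return z, geo_c, vis_c


def predict_location(v, vis_c, geo_c):
    """Pick the visually closest cluster and return its geographic centroid."""
    j = min(range(len(vis_c)), key=lambda c: dist(v, vis_c[c]))
    return geo_c[j]
```

Given geotagged training photos, `fit_latent_clusters` returns one cluster index per photo, and `predict_location` then maps the visual features of an unlabeled photo to the geographic centroid of its best-matching cluster. A discriminative classifier per cluster would be the natural replacement for the nearest-centroid rule used here.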

Author information

Authors and Affiliations

Yahoo! Research, Sunnyvale, USA
Li-Jia Li
University of Michigan, Ann Arbor, USA
Rahul Kumar Jha
Yahoo! Research, San Francisco, USA
Bart Thomee & David Ayman Shamma
IBM Watson Research, New York, USA
Liangliang Cao
University of Manitoba, Winnipeg, Canada
Yang Wang

Corresponding author

Correspondence to Li-Jia Li.

Editor information

Editors and Affiliations

Computer Science Department, Stanford University, Stanford, California, USA
Amir R. Zamir
Decisive Analytics Corporation, Arlington, Virginia, USA
Asaad Hakeem
ETH Zürich, Zürich, Switzerland
Luc Van Gool
University of Central Florida, Orlando, Florida, USA
Mubarak Shah
Facebook, Seattle, Washington, USA
Richard Szeliski

About this chapter

Cite this chapter

Li, LJ., Jha, R.K., Thomee, B., Shamma, D.A., Cao, L., Wang, Y. (2016). Where the Photos Were Taken: Location Prediction by Learning from Flickr Photos. In: Zamir, A., Hakeem, A., Van Gool, L., Shah, M., Szeliski, R. (eds) Large-Scale Visual Geo-Localization. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-25781-5_3

DOI: https://doi.org/10.1007/978-3-319-25781-5_3
Published: 06 July 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25779-2
Online ISBN: 978-3-319-25781-5
eBook Packages: Computer Science, Computer Science (R0)
