Multi-source fusion based geo-tagging for web images

Ma, Xiang; Zhao, Yisi; Qian, Xueming; Tang, Yuan Yan

doi:10.1007/s11042-017-5211-y


Published: 17 September 2017

Volume 77, pages 16399–16417, (2018)

Multimedia Tools and Applications

Xiang Ma¹,
Yisi Zhao²,
Xueming Qian²,³ (ORCID: orcid.org/0000-0002-3173-6307) &
Yuan Yan Tang⁴


Abstract

Geographic location estimation for web images has received a lot of attention in recent years. With smartphones, it has become popular to capture photos and share them on social media networks. Users often generate several tags to describe image content, and many images are embedded with geo-tags. In this paper, we propose an effective image GPS (geo-coordinates or geo-tags) estimation approach that fuses multiple sources, such as the textual, temporal and visual features of web images. We propose a hierarchical strategy to infer the GPS location of a web image. We first preselect several geographic locations of higher expected relevance, and then perform a deeper analysis inside the selected locations with an enhanced language model to return the coordinates most likely to be related to the input image. Experiments show the effectiveness of our proposed approach.
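The coarse-to-fine idea in the abstract can be illustrated with a small sketch. This is not the authors' method: the abstract gives no formulas, so everything here is an assumption made for illustration. The sketch uses only the textual source (tags), ranks hypothetical 1-degree grid cells with a Jelinek-Mercer smoothed unigram language model, and then, inside the top-ranked cells, swaps in plain Jaccard tag overlap in place of the paper's enhanced language model. The names `grid_cell`, `build_cell_models`, `cell_log_likelihood`, `estimate_location` and `SAMPLE_PHOTOS`, and the parameters `size`, `lam` and `top_k`, are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def grid_cell(lat, lon, size=1.0):
    """Map a coordinate to a coarse grid cell (assumed cell size in degrees)."""
    return (math.floor(lat / size), math.floor(lon / size))


def build_cell_models(photos, size=1.0):
    """Group geo-tagged training photos by cell and count tags per cell."""
    cells = defaultdict(list)
    for tags, lat, lon in photos:
        cells[grid_cell(lat, lon, size)].append((tags, lat, lon))
    counts = {
        cell: Counter(t for tags, _, _ in items for t in tags)
        for cell, items in cells.items()
    }
    background = Counter()
    for cell_counts in counts.values():
        background.update(cell_counts)
    return cells, counts, background


def cell_log_likelihood(query_tags, cell_counts, background, lam=0.8):
    """Jelinek-Mercer smoothed unigram log-likelihood of the query tags."""
    cell_total = sum(cell_counts.values())
    bg_total = sum(background.values())
    score = 0.0
    for tag in query_tags:
        p_cell = cell_counts[tag] / cell_total if cell_total else 0.0
        # Add-one smoothing keeps the background probability of unseen tags positive.
        p_bg = (background[tag] + 1) / (bg_total + len(background) + 1)
        score += math.log(lam * p_cell + (1 - lam) * p_bg)
    return score


def estimate_location(query_tags, photos, top_k=2, size=1.0):
    """Stage 1: rank cells by language model. Stage 2: best tag match inside them."""
    cells, counts, background = build_cell_models(photos, size)
    ranked = sorted(
        counts,
        key=lambda c: cell_log_likelihood(query_tags, counts[c], background),
        reverse=True,
    )
    query = set(query_tags)
    best, best_sim = None, -1.0
    for cell in ranked[:top_k]:
        for tags, lat, lon in cells[cell]:
            union = query | set(tags)
            sim = len(query & set(tags)) / len(union) if union else 0.0
            if sim > best_sim:
                best, best_sim = (lat, lon), sim
    return best


# Made-up training photos: (tags, latitude, longitude).
SAMPLE_PHOTOS = [
    (["eiffel", "tower", "paris"], 48.8584, 2.2945),
    (["louvre", "paris", "museum"], 48.8606, 2.3376),
    (["bigben", "london", "thames"], 51.5007, -0.1246),
    (["london", "eye", "thames"], 51.5033, -0.1196),
]

if __name__ == "__main__":
    print(estimate_location(["paris", "tower"], SAMPLE_PHOTOS))
```

Restricting the fine-grained comparison to the few highest-scoring cells is what makes the strategy hierarchical: the expensive per-photo matching runs only where the coarse model already finds the tags plausible. Fusing temporal and visual features, as the abstract describes, would need additional scoring terms that this sketch leaves out.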



Acknowledgements

This work is partly supported by the NSFC under Grants 61572083 and 61771075, the China Fundamental Research Funds for the Central Universities under Grants 310824153508 and 310824173401 (Chang’an University), and the Foundation of Guangdong Province under Grant 2016A010101005.

Author information

Authors and Affiliations

School of Information Engineering, Chang’an University, Xi’an, China
Xiang Ma
Ministry of Education Key Laboratory for Intelligent Networks and Network Security, Xi’an Jiaotong University, Xi’an, 710049, China
Yisi Zhao & Xueming Qian
Research Institute of Xi’an Jiao Tong University, Shunde, Guangdong, China
Xueming Qian
Macau University, Macau, China
Yuan Yan Tang


Corresponding author

Correspondence to Xiang Ma.


About this article

Cite this article

Ma, X., Zhao, Y., Qian, X. et al. Multi-source fusion based geo-tagging for web images. Multimed Tools Appl 77, 16399–16417 (2018). https://doi.org/10.1007/s11042-017-5211-y


Received: 26 October 2016
Revised: 15 August 2017
Accepted: 06 September 2017
Published: 17 September 2017
Issue Date: July 2018
DOI: https://doi.org/10.1007/s11042-017-5211-y
