Geotagging Aided by Topic Detection with Wikipedia

  • Rafael Odon de Alencar
  • Clodoveu Augusto Davis Jr
Part of the Lecture Notes in Geoinformation and Cartography book series (LNGC, volume 1)


Geography-aware keyword queries account for a significant share of user demand on search engines. This paper describes a strategy for tagging documents with place names according to the geographical context of their textual content, using a topic indexing technique that treats Wikipedia articles as a controlled vocabulary. By identifying those topics in the text, we connect documents to Wikipedia’s semantic network of articles, which allows us to perform operations on Wikipedia’s graph and find related places. We present an experimental evaluation on documents tagged with Brazilian states, demonstrating the feasibility of our proposal and opening the way to further research on geotagging based on semantic networks.
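The idea of moving from detected topics to related places through Wikipedia's link graph can be illustrated with a small sketch. This is not the authors' algorithm: it assumes the topics have already been detected, uses a hypothetical toy link graph (`LINKS`) and a hypothetical set of place articles (`PLACES`), and substitutes a plain breadth-first search with inverse-distance scoring for whatever graph operations the paper actually applies.

```python
from collections import deque

# Hypothetical toy link graph: article title -> titles it links to.
LINKS = {
    "Pão de queijo": ["Minas Gerais", "Cheese"],
    "Ouro Preto": ["Minas Gerais", "Baroque"],
    "Carnival": ["Rio de Janeiro (state)", "Bahia"],
    "Cheese": [],
    "Baroque": [],
}

# Hypothetical set of articles known to describe places (Brazilian states).
PLACES = {"Minas Gerais", "Rio de Janeiro (state)", "Bahia"}


def related_places(topics, links, places, max_hops=2):
    """Rank place articles by their link-graph proximity to the given topics."""
    scores = {}
    for topic in topics:
        seen = {topic}
        queue = deque([(topic, 0)])
        while queue:
            node, dist = queue.popleft()
            if node in places:
                # Closer places weigh more; do not expand beyond a place.
                scores[node] = scores.get(node, 0.0) + 1.0 / (1 + dist)
                continue
            if dist == max_hops:
                continue
            for nxt in links.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


ranking = related_places(["Pão de queijo", "Ouro Preto", "Carnival"], LINKS, PLACES)
print(ranking)
```

In this toy graph, two topics link directly to "Minas Gerais", so it ranks first and would be the suggested tag. A real system would need the actual Wikipedia link structure and a principled way to weight links.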


Keywords (added by machine, not by the authors): Semantic Network, Brazilian State, Anchor Text, Topic Detection, Geographic Entity





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Rafael Odon de Alencar (1, 2)
  • Clodoveu Augusto Davis Jr (1)
  1. Database Laboratory, Departamento de Ciência da Computação, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
  2. Serviço Federal de Processamento de Dados (SERPRO), Manaus, Brazil
