Using Generic Ontologies to Infer the Geographic Focus of Text

  • Christos Rodosthenous
  • Loizos Michael
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11352)

Abstract

Certain documents are naturally associated with a country as their geographic focus. Some past work has sought to develop systems that identify this focus, under the assumption that the target country is explicitly mentioned in the document. When this assumption is not met, the task becomes one of inferring the focus from the context that the document provides. Although some existing work has considered this variant of the task, that work typically relies on specialized geographic resources. In this work we seek to demonstrate that this inference task can be tackled by using generic ontologies, like ConceptNet and YAGO, that have been developed independently of the particular task. We describe GeoMantis, the system we developed for inferring the geographic focus of a document, and we undertake a comparative evaluation against two freely available open-source systems. Our results show that GeoMantis performs better than these two systems when the comparison is made on news stories whose target country is either not explicitly mentioned, or has been artificially obscured, in the story text.
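
To make the inference task concrete, the following is a minimal sketch of the general idea of scoring countries by the ontology facts that a text activates. It is not the GeoMantis implementation: the triples are hand-written toy assumptions rather than actual ConceptNet or YAGO entries, the function name `infer_focus` is hypothetical, and the plain match count stands in for whatever ranking a real system would use.

```python
import re
from collections import Counter

# Hand-written toy triples in the spirit of a generic ontology. They are
# illustrative assumptions, not entries retrieved from ConceptNet or YAGO.
TOY_TRIPLES = [
    ("eiffel tower", "AtLocation", "France"),
    ("louvre", "AtLocation", "France"),
    ("seine", "AtLocation", "France"),
    ("colosseum", "AtLocation", "Italy"),
    ("tiber", "AtLocation", "Italy"),
    ("vesuvius", "AtLocation", "Italy"),
]


def infer_focus(text, triples=TOY_TRIPLES):
    """Rank countries by how many of their associated concepts occur in text."""
    # Lowercase and strip punctuation so that concepts match on word boundaries.
    normalized = " ".join(re.findall(r"[a-z]+", text.lower()))
    padded = f" {normalized} "
    scores = Counter()
    for concept, _relation, country in triples:
        if f" {concept} " in padded:
            scores[country] += 1
    # Countries with the most matched concepts come first.
    return scores.most_common()


# The story never names the country, so the focus has to be inferred.
story = "Crowds gathered near the Louvre and along the Seine on Sunday."
print(infer_focus(story))  # [('France', 2)]
```

The example story mentions no country by name, which mirrors the setting the abstract describes; a text that matches no concept yields an empty ranking.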

Keywords

Information retrieval · Geographic focus identification · Ontologies · Natural language processing · Geographic information systems

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Open University of Cyprus, Nicosia, Cyprus
