Abstract
The extraction and the disambiguation of knowledge guided by textual resources on the web is a crucial process to advance the Web of Linked Data. The goal of our work is to semantically enrich raw data by linking the mentions of named entities in the text to the corresponding known entities in knowledge bases. In our approach multiple aspects are considered: the prior knowledge of an entity in Wikipedia (i.e. the keyphraseness and commonness features that can be precomputed by crawling the Wikipedia dump), a set of features extracted from the input text and from the knowledge base, along with the correlation/relevancy among the resources in Linked Data. More precisely, this work explores the collective ranking approach formalized as a weighted graph model, in which the mentions in the input text and the candidate entities from knowledge bases are linked using the local compatibility and the global relatedness measures. Experiments on the datasets of the Open Knowledge Extraction (OKE) challenge with different configurations of our approach in each phase of the linking pipeline reveal its optimum mode. We investigate the notion of semantic relatedness between two entities represented as sets of neighbours in Linked Open Data that relies on an associative retrieval algorithm, with consideration of common neighbourhood. This measure improves the performance of prior link-based models and outperforms the explicit inter-link relevancy measure among entities (mostly Wikipedia-centric). Thus, our approach is resilient to non-existent or sparse links among related entities.
This work has been founded by the French ANR national grant (ANR-13-LAB2-0001).
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
- 2.
English Wikipedia dump downloaded on 2015/07/02.
- 3.
- 4.
- 5.
- 6.
References
Adafre, S.F., de Rijke, M.: Discovering missing links in wikipedia. In: Proceedings of the 3rd International Workshop on Link Discovery, LinkKDD 2005, pp. 90–97. ACM, New York (2005). http://doi.acm.org/10.1145/1134271.1134284
Ceccarelli, D., Lucchese, C., Orlando, S., Perego, R., Trani, S.: Dexter: an open source framework for entity linking. In: Proceedings of the Sixth International Workshop on Exploiting Semantic Annotations in Information Retrieval, pp. 17–20 (2013)
Ceccarelli, D., Lucchese, C., Orlando, S., Perego, R., Trani, S.: Learning relatedness measures for entity linking. In: Proceedings of the 22nd ACM International Conference on Conference on Information & Knowledge Management, CIKM 2013, pp. 139–148 (2013)
Cheng, X., Roth, D.: Relational inference for wikification. In: EMNLP (2013). http://cogcomp.cs.illinois.edu/papers/ChengRo13.pdf
Cornolti, M., Ferragina, P., Ciaramita, M.: A framework for benchmarking entity-annotation systems. In: Proceedings of the 22nd International Conference on World Wide Web, WWW 2013, pp. 249–260 (2013)
Cucerzan, S.: Large-scale named entity disambiguation based on wikipedia data. In: EMNLP-CoNLL 2007, Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Prague, Czech Republic, 28–30 June 2007, pp. 708–716 (2007)
Dalton, J., Dietz, L.: A neighborhood relevance model for entity linking. In: Proceedings of the 10th Conference on Open Research Areas in Information Retrieval, OAIR 2013, pp. 149–156 (2013)
Durrett, G., Klein, D.: A joint model for entity analysis: coreference, typing, and linking. In: Proceedings of the Transactions of the Association for Computational Linguistics (2014)
Erbs, N., Zesch, T., Gurevych, I.: Link discovery: a comprehensive analysis. In: 2011 Fifth IEEE International Conference on Semantic Computing (ICSC), pp. 83–86, September 2011
Fader, A., Soderland, S., Etzioni, O.: Scaling wikipedia-based named entity disambiguation to arbitrary web text. In: Proceedings of WIKIAI (2009)
Fernández, N., Arias Fisteus, J., Sánchez, L., López, G.: Identityrank: named entity disambiguation in the news domain. Expert Syst. Appl. 39(10), 9207–9221 (2012)
Ferragina, P., Scaiella, U.: Tagme: on-the-fly annotation of short text fragments (by wikipedia entities). In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, CIKM 2010, pp. 1625–1628 (2010)
Finkel, J.R., Grenager, T., Manning, C.: Incorporating non-local information into information extraction systems by gibbs sampling. In: ACL, pp. 363–370 (2005)
García, N.F., Arias-Fisteus, J., Fernández, L.S., Martín, E.: Webtlab: A cooccurrence-based approach to KBP 2010 entity-linking task. In: Proceedings of the Third Text Analysis Conference, TAC 2010, Gaithersburg, Maryland, USA, 15–16 November 2010 (2010)
Guo, Z., Barbosa, D.: Robust entity linking via random walks. In: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, CIKM 2014, pp. 499–508. ACM (2014). http://doi.acm.org/10.1145/2661829.2661887
Han, X., Sun, L., Zhao, J.: Collective entity linking in web text: a graph-based method. In: Proceedings of the 34th international Conference on Research and Development in Information Retrieval, pp. 765–774 (2011)
Hellmann, S., Lehmann, J., Auer, S., Brümmer, M.: Integrating NLP using linked data. In: Alani, H., et al. (eds.) ISWC 2013, Part II. LNCS, vol. 8219, pp. 98–113. Springer, Heidelberg (2013)
Hoffart, J., Seufert, S., Nguyen, D.B., Theobald, M., Weikum, G.: Kore: keyphrase overlap relatedness for entity disambiguation. In: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM 2012, pp. 545–554 (2012)
Hoffart, J., Yosef, M.A., Bordino, I., Fürstenau, H., Pinkal, M., Spaniol, M., Taneva, B., Thater, S., Weikum, G.: Robust disambiguation of named entities in text. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 782–792 (2011)
Kulkarni, S., Singh, A., Ramakrishnan, G., Chakrabarti, S.: Collective annotation of wikipedia entities in web text. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2009, pp. 457–466. ACM, New York (2009)
Marie, N., Gandon, F.L., Giboin, A., Palagi, É.: Exploratory search on topics through different perspectives with DBpedia. In: Proceedings of the 10th International Conference on Semantic Systems, SEMANTICS 2014, Leipzig, Germany, 4–5 September 2014, pp. 45–52 (2014). http://doi.acm.org/10.1145/2660517.2660518
Mendes, P.N., Jakob, M., García-Silva, A., Bizer, C.: DBpedia spotlight: shedding light on the web of documents. In: Proceedings of the 7th International Conference on Semantic Systems, I-Semantics 2011, pp. 1–8. ACM, New York (2011). http://doi.acm.org/10.1145/2063518.2063519
Mihalcea, R., Csomai, A.: Wikify!: linking documents to encyclopedic knowledge. In: Proceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management, pp. 233–242 (2007)
Milne, D., Witten, I.H.: An effective, low-cost measure of semantic relatedness obtained from wikipedia links. In: Proceeding of AAAI Workshop on Wikipedia and Artificial Intelligence: An Evolving Synergy, pp. 25–30, July 2008
Milne, D., Witten, I.H.: Learning to link with wikipedia. In: Proceedings of the 17th ACM Conference on Information and Knowledge Management, pp. 509–518 (2008)
Moro, A., Raganato, A., Navigli, R.: Entity linking meets word sense disambiguation: a unified approach. Trans. Assoc. Comput. Linguist. 2, 231–244 (2014)
Page, L., Brin, S., Motwani, R., Winograd, T.: The pagerank citation ranking: Bringing order to the web. Technical Report 1999–66, Stanford InfoLab, November 1999. http://ilpubs.stanford.edu:8090/422/
Plu, J., Rizzo, G., Troncy, R.: A hybrid approach for entity recognition and linking. In: ESWC 2015, 12th European Semantic Web Conference, Open Extraction Challenge, Portoroz, Slovenia, 31 May-4 June 2015 (2015). http://www.eurecom.fr/publication/4613
Ponzetto, S.P., Strube, M.: Knowledge derived from wikipedia for computing semantic relatedness. J. Artif. Intell. Res. (JAIR) 30, 181–212 (2007)
Ratinov, L., Roth, D., Downey, D., Anderson, M.: Local and global algorithms for disambiguation to wikipedia. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, HLT 2011, vol. 1, pp. 1375–1384 (2011)
Shen, W., Wang, J., Han, J.: Entity linking with a knowledge base: issues, techniques, and solutions. IEEE Trans. Knowl. Data Eng. 27(2), 443–460 (2015)
Singh, S., Subramanya, A., Pereira, F., McCallum, A.: Wikilinks: A large-scale cross-document coreference corpus labeled via links to Wikipedia. Technical report, UM-CS-2012-015 (2012)
Spitkovsky, V.I., Chang, A.X.: A cross-lingual dictionary for english wikipedia concepts. In: Chair, N.C.C., Choukri, K., Declerck, T., Doan, M.U., Maegaard, B., Mariani, J., Moreno, A., Odijk, J., Piperidis, S. (eds.) Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC 2012), Istanbul, Turkey, May 2012
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Nooralahzadeh, F., Lopez, C., Cabrio, E., Gandon, F., Segond, F. (2016). Adapting Semantic Spreading Activation to Entity Linking in Text. In: Métais, E., Meziane, F., Saraee, M., Sugumaran, V., Vadera, S. (eds) Natural Language Processing and Information Systems. NLDB 2016. Lecture Notes in Computer Science(), vol 9612. Springer, Cham. https://doi.org/10.1007/978-3-319-41754-7_7
Download citation
DOI: https://doi.org/10.1007/978-3-319-41754-7_7
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41753-0
Online ISBN: 978-3-319-41754-7
eBook Packages: Computer ScienceComputer Science (R0)