Abstract
Entity linking (also called entity annotation) is the task of linking named entity mentions on Web pages to the corresponding entities in a knowledge base (KB). With continued progress in information extraction and semantic search techniques, entity linking has received much attention in both the research and industrial communities. The main challenge of the task is entity disambiguation. To the best of our knowledge, the huge existing RDF KBs have not been fully exploited for entity linking. In this paper, we study the entity linking problem using RDF KBs. Besides the accuracy of entity linking, we also study scalability in handling huge Web corpora and large RDF KBs. The experimental results show that our entity linking solution achieves not only very good accuracy but also good scalability.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Du, F., Chen, Y., Du, X. (2013). Linking Entities in Unstructured Texts with RDF Knowledge Bases. In: Ishikawa, Y., Li, J., Wang, W., Zhang, R., Zhang, W. (eds) Web Technologies and Applications. APWeb 2013. Lecture Notes in Computer Science, vol 7808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37401-2_25
DOI: https://doi.org/10.1007/978-3-642-37401-2_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37400-5
Online ISBN: 978-3-642-37401-2
eBook Packages: Computer Science, Computer Science (R0)