Abstract
Entity linking is the task of detecting proper nouns or concrete concepts (a.k.a. mentions) in documents and mapping them to the corresponding entries in a given knowledge base. In this paper, we propose an entity linking framework, POSLS, consisting of three components: mention detection, candidate selection, and entity disambiguation. First, we use part-of-speech tagging and English syntactic rules to detect mentions. We then select candidate entries with Lucene search. Finally, we identify the best match for each mention with a similarity-based disambiguation method. Experimental results show that our approach achieves acceptable accuracy.
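The three-stage pipeline described above can be illustrated with a minimal sketch. It is not the POSLS implementation: the toy knowledge base `KB`, the capitalization-based tagger `pos_tag`, the token-overlap lookup standing in for Lucene search, and the use of Jaccard similarity for disambiguation are all assumptions made for illustration, since the abstract does not specify these details.

```python
import re

# Hypothetical toy knowledge base: entry title -> short description.
KB = {
    "Apple Inc.": "American technology company that makes the iPhone and Mac computers",
    "Apple (fruit)": "edible fruit that grows on trees",
    "Paris": "capital city of France",
}

# Assumed list of words that should never be tagged as proper nouns.
FUNCTION_WORDS = {"the", "a", "an", "in", "on", "of", "and", "its", "new"}


def tokens(text):
    """Lowercased alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def pos_tag(words):
    """Toy stand-in for a POS tagger: capitalized non-function words are NNP."""
    return [
        (w, "NNP" if w[0].isupper() and w.lower() not in FUNCTION_WORDS else "O")
        for w in words
    ]


def detect_mentions(text):
    """Stage 1: group consecutive NNP tokens into mentions (one simple syntactic rule)."""
    mentions, current = [], []
    for word, tag in pos_tag(re.findall(r"[A-Za-z0-9]+", text)):
        if tag == "NNP":
            current.append(word)
        elif current:
            mentions.append(" ".join(current))
            current = []
    if current:
        mentions.append(" ".join(current))
    return mentions


def select_candidates(mention):
    """Stage 2: token-overlap lookup on entry titles (stand-in for a Lucene query)."""
    mention_tokens = tokens(mention)
    return [title for title in KB if mention_tokens & tokens(title)]


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def disambiguate(candidates, context):
    """Stage 3: pick the candidate whose description is most similar to the context."""
    if not candidates:
        return None
    context_tokens = tokens(context)
    return max(candidates, key=lambda t: jaccard(context_tokens, tokens(KB[t])))


def link(text):
    """Run the three stages and map each detected mention to a KB entry (or None)."""
    return {m: disambiguate(select_candidates(m), text) for m in detect_mentions(text)}


result = link("Apple unveiled a new iPhone in Paris.")
print(result)
```

In this sketch the mention "Apple" has two candidates, and the surrounding word "iPhone" tips the similarity score toward the company entry. A real system would replace each stage with a trained tagger, an indexed search engine, and a similarity measure tuned on labeled data.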
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhao, S., Li, C., Ma, S., Ma, T., Ma, D. (2013). Combining POS Tagging, Lucene Search and Similarity Metrics for Entity Linking. In: Lin, X., Manolopoulos, Y., Srivastava, D., Huang, G. (eds) Web Information Systems Engineering – WISE 2013. WISE 2013. Lecture Notes in Computer Science, vol 8180. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41230-1_44
DOI: https://doi.org/10.1007/978-3-642-41230-1_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41229-5
Online ISBN: 978-3-642-41230-1
eBook Packages: Computer Science (R0)