Combining POS Tagging, Lucene Search and Similarity Metrics for Entity Linking

  • Conference paper
Web Information Systems Engineering – WISE 2013 (WISE 2013)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8180)

Abstract

Entity linking is the task of detecting proper nouns or concrete concepts (a.k.a. mentions) in documents and mapping them to the corresponding entries in a given knowledge base. In this paper, we propose POSLS, an entity linking framework consisting of three components: mention detection, candidate selection, and entity disambiguation. First, we use part-of-speech tagging and English syntactic rules to detect mentions. We then select candidates with Lucene search. Finally, we identify the best matches with a similarity-based disambiguation method. Experimental results show that our approach achieves acceptable accuracy.
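
The three stages named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the input is already tagged with Penn Treebank style tags, the only detection rule is "group adjacent proper nouns", a small in-memory inverted index stands in for Lucene search, and Jaccard overlap stands in for the similarity metric. The knowledge base `KB` and all function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical toy knowledge base: entry name -> short description.
KB = {
    "Apple_Inc.": "apple technology company iphone mac",
    "Apple_(fruit)": "apple fruit tree orchard sweet",
    "Steve_Jobs": "steve jobs co-founder apple technology company",
}


def detect_mentions(tagged):
    """Stage 1 (assumed rule): group maximal runs of NNP/NNPS tokens."""
    mentions, run = [], []
    for word, tag in tagged:
        if tag in ("NNP", "NNPS"):
            run.append(word)
        elif run:
            mentions.append(" ".join(run))
            run = []
    if run:
        mentions.append(" ".join(run))
    return mentions


def build_index(kb):
    """Inverted index over entry-name tokens; a stand-in for a Lucene index."""
    index = defaultdict(set)
    for entry in kb:
        for token in re.findall(r"[a-z]+", entry.lower()):
            index[token].add(entry)
    return index


def select_candidates(mention, index):
    """Stage 2: every entry whose name shares a token with the mention."""
    candidates = set()
    for token in mention.lower().split():
        candidates |= index.get(token, set())
    return candidates


def jaccard(a, b):
    """Jaccard similarity of two token collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def disambiguate(context, candidates, kb):
    """Stage 3: pick the candidate whose description best overlaps the context."""
    if not candidates:
        return None
    # sorted() makes tie-breaking deterministic.
    return max(sorted(candidates), key=lambda e: jaccard(context, kb[e].split()))


def link(tagged, kb=KB):
    """Run the three stages over one tagged sentence."""
    index = build_index(kb)
    context = [word.lower() for word, _ in tagged]
    return {
        mention: disambiguate(context, select_candidates(mention, index), kb)
        for mention in detect_mentions(tagged)
    }
```

Calling `link` on a tagged sentence such as `[("Steve", "NNP"), ("Jobs", "NNP"), ("unveiled", "VBD"), ("the", "DT"), ("iphone", "NN"), ("at", "IN"), ("Apple", "NNP")]` returns a dictionary from each detected mention to the chosen entry. In a real system the tags would come from a POS tagger and the candidates from an actual Lucene index over the knowledge base.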

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhao, S., Li, C., Ma, S., Ma, T., Ma, D. (2013). Combining POS Tagging, Lucene Search and Similarity Metrics for Entity Linking. In: Lin, X., Manolopoulos, Y., Srivastava, D., Huang, G. (eds) Web Information Systems Engineering – WISE 2013. WISE 2013. Lecture Notes in Computer Science, vol 8180. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41230-1_44

  • DOI: https://doi.org/10.1007/978-3-642-41230-1_44

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41229-5

  • Online ISBN: 978-3-642-41230-1

  • eBook Packages: Computer Science (R0)
