An Automatic sameAs Link Discovery from Wikipedia

  • Conference paper
Semantic Technology (JIST 2013)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8388)

Abstract

Spelling variants and word sense ambiguity impose substantial costs on processes such as data integration, information search, and data preprocessing for data mining. To meet these demands, it is useful to construct relations between a word or phrase and a representative name of the entity. To reduce the costs, this paper discusses how to automatically discover “sameAs” and “meaningOf” links from Japanese Wikipedia. To do so, we gathered relevant features such as IDF, string similarity, and number of hypernyms. We identified a link-based score on the salient features, based on SVM results with 960,000 anchor link pairs. Our case studies show that the link discovery method performs well, with precision and recall rates of more than 70%.
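
As a rough illustration of the kind of features the abstract names, the sketch below computes an IDF value and a string similarity score for one anchor text and article title pair. The abstract does not give the paper's feature definitions, so everything here is an assumption: the smoothed IDF formula, the use of `difflib.SequenceMatcher` as the similarity measure, the function names, and the toy data are all hypothetical and are not the authors' method.

```python
import math
from difflib import SequenceMatcher


def idf(term, documents):
    """Smoothed inverse document frequency of `term` over a list of token lists.

    The smoothing (+1 in numerator and denominator, +1 overall) is an
    assumption; the abstract does not state which IDF variant is used.
    """
    containing = sum(1 for doc in documents if term in doc)
    return math.log((1 + len(documents)) / (1 + containing)) + 1.0


def string_similarity(a, b):
    """Similarity ratio in [0, 1]; a stand-in for the unspecified measure."""
    return SequenceMatcher(None, a, b).ratio()


def pair_features(anchor, title, documents):
    """Feature vector for one (anchor text, article title) pair."""
    return [idf(anchor, documents), string_similarity(anchor, title)]


# Toy data, invented for illustration only.
docs = [["NYC", "subway"], ["NYC", "pizza"], ["Big Apple", "tourism"]]
features = pair_features("NYC", "New York City", docs)
```

Vectors like `features` could then be passed to a classifier such as an SVM, which is the role the abstract assigns to it; the number-of-hypernyms feature is omitted because it would require a hypernym resource not described here.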

Author information

Corresponding author

Correspondence to Takahira Yamaguchi.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Kagawa, K., Tamagawa, S., Yamaguchi, T. (2014). An Automatic sameAs Link Discovery from Wikipedia. In: Kim, W., Ding, Y., Kim, HG. (eds) Semantic Technology. JIST 2013. Lecture Notes in Computer Science, vol 8388. Springer, Cham. https://doi.org/10.1007/978-3-319-06826-8_29

  • DOI: https://doi.org/10.1007/978-3-319-06826-8_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-06825-1

  • Online ISBN: 978-3-319-06826-8

  • eBook Packages: Computer Science, Computer Science (R0)
