Abstract
Spelling variants and word-sense ambiguity impose significant costs on processes such as data integration, information search, and data preprocessing for data mining. Linking a word or phrase to a representative name for the entity it denotes helps reduce these costs. This paper discusses how to automatically discover “sameAs” and “meaningOf” links from Japanese Wikipedia. To do so, we gathered relevant features such as IDF, string similarity, and the number of hypernyms. Based on SVM results over 960,000 anchor link pairs, we identified a link-based score over the salient features. Case studies show that our link discovery method performs well, achieving precision and recall above 70%.
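As a rough illustration of two of the features named in the abstract, the following sketch computes an IDF value and a string similarity for a single (anchor text, article title) pair. It is not the authors' implementation: the function names, the smoothed IDF formula, the use of `difflib.SequenceMatcher` as the similarity measure, and the toy documents are all assumptions made for this example. The hypernym-count feature and the SVM training step are omitted.

```python
import math
from difflib import SequenceMatcher


def idf(term, documents):
    """Smoothed inverse document frequency of `term` over tokenized documents.

    `documents` is a list of token sets. The smoothing is an assumption of
    this sketch, not a formula taken from the paper.
    """
    containing = sum(1 for doc in documents if term in doc)
    return math.log((1 + len(documents)) / (1 + containing)) + 1.0


def string_similarity(a, b):
    """Character-level similarity ratio in [0, 1] (assumed measure)."""
    return SequenceMatcher(None, a, b).ratio()


def pair_features(anchor, title, documents):
    """Feature vector for one (anchor text, target article title) pair."""
    return [idf(anchor, documents), string_similarity(anchor, title)]


# Toy data, invented for illustration only.
docs = [
    {"東京", "日本", "首都"},
    {"東京都", "日本"},
    {"大阪", "日本"},
]
features = pair_features("東京", "東京都", docs)
```

Vectors of this kind, one per anchor link pair, could then be passed to any SVM implementation together with labels marking which pairs are true “sameAs” links.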
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Kagawa, K., Tamagawa, S., Yamaguchi, T. (2014). An Automatic sameAs Link Discovery from Wikipedia. In: Kim, W., Ding, Y., Kim, HG. (eds) Semantic Technology. JIST 2013. Lecture Notes in Computer Science, vol 8388. Springer, Cham. https://doi.org/10.1007/978-3-319-06826-8_29
DOI: https://doi.org/10.1007/978-3-319-06826-8_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06825-1
Online ISBN: 978-3-319-06826-8
eBook Packages: Computer Science, Computer Science (R0)