Linguistically Informed Mining Lexical Semantic Relations from Wikipedia Structure

Piasecki, Maciej; Indyka-Piasecka, Agnieszka; Kurc, Roman

doi:10.1007/978-3-642-20039-7_30

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6591)

Included in the following conference series:

Asian Conference on Intelligent Information and Database Systems

Abstract

A method for extracting wordnet lexico-semantic relations from Polish Wikipedia articles is proposed. The method is based on a set of hand-written lexico-morphosyntactic extraction patterns that were developed in less than one man-week of work. Two kinds of patterns are proposed: those that process encyclopaedia articles as text documents, and those that utilise information about the structure of a Wikipedia article (including links). Two types of evaluation were applied: manual assessment of the extracted data, and an application-based evaluation in which the extracted data served as an additional knowledge source in automatic plWordNet expansion.
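
The two kinds of patterns described above can be illustrated with a minimal sketch. It is not the authors' pattern set: the abstract does not list the patterns, and the originals operate on morphosyntactically tagged Polish text. The sketch assumes English toy sentences, plain regular expressions in place of lexico-morphosyntactic constraints, and MediaWiki-style `[[target|label]]` link markup. The names `TEXT_PATTERN`, `WIKI_LINK`, `extract_from_text`, `extract_from_structure` and the relation label `hyponym_of` are hypothetical.

```python
import re

# Hypothetical text-only pattern: "<term> is a/an/the <hypernym>" in a
# definition sentence (a Hearst-style "is-a" pattern, assumed here).
TEXT_PATTERN = re.compile(
    r"^(?:(?:A|An|The) )?(?P<hypo>[\w ]+?) is (?:an?|the) (?P<hyper>[\w ]+?)[.,]"
)

# Hypothetical structure-aware pattern: a wiki link [[target]] or
# [[target|label]] found in the opening sentence of an article.
WIKI_LINK = re.compile(r"\[\[(?P<target>[^\]|]+)(?:\|(?P<label>[^\]]+))?\]\]")


def extract_from_text(sentence):
    """Treat the article as plain text; return a relation triple or None."""
    match = TEXT_PATTERN.match(sentence)
    if match is None:
        return None
    return (match.group("hypo").lower(), "hyponym_of", match.group("hyper").lower())


def extract_from_structure(title, first_sentence):
    """Use article structure: pair the article title with the target of the
    first link in the opening sentence; return a relation triple or None."""
    match = WIKI_LINK.search(first_sentence)
    if match is None:
        return None
    return (title.lower(), "hyponym_of", match.group("target").lower())


if __name__ == "__main__":
    print(extract_from_text("A violin is a string instrument."))
    print(extract_from_structure(
        "Violin", "The violin is a [[string instrument]] played with a bow."))
```

In this sketch the structure-aware variant trusts the link target rather than the surface wording, which is one plausible reason link information could help. Candidates produced either way would still need the kind of manual or application-based assessment the abstract mentions.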

Author information

Authors and Affiliations

Institute of Informatics, Wrocław University of Technology, Poland
Maciej Piasecki, Agnieszka Indyka-Piasecka & Roman Kurc

Editor information

Editors and Affiliations

Wroclaw University of Technology, 50-370, Wroclaw, Poland
Ngoc Thanh Nguyen
Department of Computer Engineering, Yeungnam University, 712-749, Dae-Dong, Gyeungsan, Korea
Chong-Gun Kim
Institute of Informatics, Automation and Robotics, Wroclaw University of Technology, 50-370, Wrocław, Poland
Adam Janiak

About this paper

Cite this paper

Piasecki, M., Indyka-Piasecka, A., Kurc, R. (2011). Linguistically Informed Mining Lexical Semantic Relations from Wikipedia Structure. In: Nguyen, N.T., Kim, CG., Janiak, A. (eds) Intelligent Information and Database Systems. ACIIDS 2011. Lecture Notes in Computer Science, vol 6591. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20039-7_30

DOI: https://doi.org/10.1007/978-3-642-20039-7_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20038-0
Online ISBN: 978-3-642-20039-7
eBook Packages: Computer Science, Computer Science (R0)