Classification-Based Filtering of Semantic Relatedness in Hypernymy Extraction

Piasecki, Maciej; Szpakowicz, Stanisław; Marcińczuk, Michał; Broda, Bartosz

doi:10.1007/978-3-540-85287-2_38

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5221)

Abstract

Manual construction of a wordnet can be facilitated by a system that suggests semantic relations acquired from corpora. Such systems tend to produce many wrong suggestions. We propose a method of filtering a raw list of noun pairs potentially linked by hypernymy, and test it on Polish. The method aims for good recall and sufficient precision. The classifiers work with complex features that give clues on the relation between the nouns. We apply a corpus-based measure of semantic relatedness enhanced with a Rank Weight Function. The evaluation is based on the data in Polish WordNet. The results compare favourably with similar methods applied to English, despite the small size of Polish WordNet.

Author information

Authors and Affiliations

Institute of Applied Informatics, Wrocław University of Technology, Poland
Maciej Piasecki, Michał Marcińczuk & Bartosz Broda
School of Information Technology and Engineering, University of Ottawa, Canada
Stanisław Szpakowicz
Institute of Computer Science, Polish Academy of Sciences, Poland
Stanisław Szpakowicz

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Chalmers University of Technology, 41296, Göteborg, Sweden
Bengt Nordström & Aarne Ranta

About this paper

Cite this paper

Piasecki, M., Szpakowicz, S., Marcińczuk, M., Broda, B. (2008). Classification-Based Filtering of Semantic Relatedness in Hypernymy Extraction. In: Nordström, B., Ranta, A. (eds) Advances in Natural Language Processing. GoTAL 2008. Lecture Notes in Computer Science, vol 5221. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85287-2_38

DOI: https://doi.org/10.1007/978-3-540-85287-2_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85286-5
Online ISBN: 978-3-540-85287-2
eBook Packages: Computer Science, Computer Science (R0)
