Abstract
We present an approach to the acquisition of some classes of compound words from large corpora, together with a method for the semi-automatic generation of appropriate linguistic models. These models can then be used for compound word recognition and for completing compound word dictionaries. The approach is intended for highly inflective languages such as Serbo-Croatian. The generated linguistic models are represented by local grammars.
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nenadić, G., Spasić, I. (2000). Recognition and Acquisition of Compound Names from Corpora. In: Christodoulakis, D.N. (eds) Natural Language Processing — NLP 2000. NLP 2000. Lecture Notes in Computer Science, vol 1835. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45154-4_4
DOI: https://doi.org/10.1007/3-540-45154-4_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67605-8
Online ISBN: 978-3-540-45154-9
eBook Packages: Springer Book Archive