Abstract
This paper deals with the semi-automatic extension of CroDeriV, a morphological database of Croatian verbs, with verb valency frames. In its present form, the database comprises 14 500 verbs in their infinitive forms. Each verb in CroDeriV is segmented into lexical and derivational morphemes, and verbs sharing the same root are linked to one another. To further enrich CroDeriV with semantic and syntactic information, we used the NooJ platform to recognize derivationally related verbs, find the verb frames, and speed up sentence processing.
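The structure described above (verbs segmented into morphemes, with verbs of the same root linked) can be pictured with a minimal Python sketch. The `VerbEntry` schema, the `link_by_root` helper, and the sample segmentations are assumptions made for illustration only; they are not the actual CroDeriV data model or records taken from the database.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class VerbEntry:
    """One infinitive split into morphemes (hypothetical schema)."""
    infinitive: str
    prefixes: Tuple[str, ...]   # derivational prefixes, outermost first
    root: str                   # lexical morpheme
    suffixes: Tuple[str, ...]   # derivational suffixes
    ending: str                 # infinitive ending

    def segmented(self) -> str:
        """Return the morphemes joined by hyphens."""
        return "-".join((*self.prefixes, self.root, *self.suffixes, self.ending))


def link_by_root(entries: Iterable[VerbEntry]) -> Dict[str, List[str]]:
    """Group infinitives that share a root, keeping input order."""
    families = defaultdict(list)
    for entry in entries:
        families[entry.root].append(entry.infinitive)
    return dict(families)


# Illustrative segmentations only, not records from CroDeriV.
ENTRIES = [
    VerbEntry("pisati", (), "pis", ("a",), "ti"),
    VerbEntry("napisati", ("na",), "pis", ("a",), "ti"),
    VerbEntry("prepisati", ("pre",), "pis", ("a",), "ti"),
    VerbEntry("čitati", (), "čit", ("a",), "ti"),
]

FAMILIES = link_by_root(ENTRIES)
```

Grouping by root is the simplest way to express the mutual linking; a real database would likely also record the direction of each derivation, which this sketch leaves out.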
Notes
- 1.
CroDeriV resembles databases such as CatVar for English (http://clipdemos.umiacs.umd.edu/catvar) and Unimorph for Russian (http://courses.washington.edu/unimorph).
- 2.
There are approximately 120 verbal compounds recorded in CroDeriV.
- 3.
A more detailed account is given in [6].
- 4.
It takes even longer if a new paradigm needs to be defined in the NOF file.
- 5.
Here, "existing" means previously built by the inflectional grammar of a dictionary entry.
- 6.
If we view language as a living thing, it is to be expected that new derivations will appear over time.
- 7.
In this paper, the main verbs selected for this research are referred to as root verbs and are marked as TYPE = “ROOT”, while the derivations of these main verbs are marked as TYPE = “NOVIx”. All other verbs that may appear in a sentence are marked only as a <VP> chunk with no additional attributes.
References
Agić, Ž., Tadić, M., Dovedan, Z.: Evaluating full lemmatization of Croatian texts. In: Recent Advances in Intelligent Information Systems, pp. 175–184. Academic Publishing House EXIT, Warsaw (2009)
Anić, V.: Rječnik hrvatskoga jezika. Novi liber, Zagreb (2004)
Bekavac, B., Šojat, K.: Syntactic patterns of verb definitions in Croatian WordNet. In: Vučković, K., Bekavac, B., Silberztein, M. (eds.) International Conference on Formalising Natural Languages with NooJ. Selected Papers from the NooJ 2011, pp. 112–121. Cambridge Scholars Publishing, Newcastle (2011)
Ljubešić, N., Erjavec, T.: hrWaC and slWac: compiling Web Corpora for Croatian and Slovene. In: Habernal, I., Matoušek, V. (eds.) TSD 2011. LNCS (LNAI), vol. 6836, pp. 395–402. Springer, Heidelberg (2011). doi:10.1007/978-3-642-23538-2_50
Silberztein, M.: The NooJ Manual (2003). www.nooj4nlp.net
Šojat, K., Srebačić, M., Štefanec, V.: CroDeriV i morfološka raščlamba hrvatskoga glagola. Suvremena lingvistika 39(75), 75–96 (2013). Zagreb
Šojat, K., Srebačić, M., Pavelić, T., Tadić, M.: CroDeriV: a new resource for processing Croatian morphology. In: Calzolari, N., Choukri, K., Declerck, T., Loftsson, H., Maegaard, B., Mariani, J., Moreno, A., Odijk, J., Piperidis, S. (eds.) Proceedings of the 9th International Conference on Language Resources and Evaluation, pp. 3366–3370. Reykjavik, Iceland (2014)
Šojat, K., Vučković, K., Tadić, M.: Extracting verb valency frames with NooJ. In: Ben Hamadou, A., Mesfar, S., Silberztein, M. (eds.) International Conference and Workshop on Finite State Language Engineering, NooJ 2009, pp. 231–242. Centre de Publication Universitaire, Sfax, Tunisia (2010)
Šonje, J. (ed.): Rječnik hrvatskoga jezika. Leksikografski zavod Miroslav Krleža & Školska knjiga, Zagreb (2000)
Tadić, M.: New version of the Croatian National Corpus. In: After Half a Century of Slavonic Natural Language Processing, pp. 199–209. Masaryk University, Brno (2009)
Vučković, K., Tadić, M., Bekavac, B.: Croatian language resources for NooJ. CIT J. Comput. Inf. Technol. 18, 295–301 (2010). Zagreb
Vučković, K.: Model parsera za hrvatski jezik. Ph.D. dissertation. Faculty of Humanities and Social Sciences, Zagreb (2009)
Vučković, K., Mikelić Preradović, N., Dovedan, Z.: Verb valency enhanced Croatian Lexicon. In: Judit, K., Silberztein, M., Varadi, T. (eds.) International Conference on Selected Papers from the NooJ 2008, pp. 52–59. Cambridge Scholars Publishing, Newcastle (2010)
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Šojat, K., Bekavac, B., Kocijan, K. (2016). Detection of Verb Frames with NooJ. In: Barone, L., Monteleone, M., Silberztein, M. (eds) Automatic Processing of Natural-Language Electronic Texts with NooJ. NooJ 2016. Communications in Computer and Information Science, vol 667. Springer, Cham. https://doi.org/10.1007/978-3-319-55002-2_14
DOI: https://doi.org/10.1007/978-3-319-55002-2_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-55001-5
Online ISBN: 978-3-319-55002-2
eBook Packages: Computer Science, Computer Science (R0)