Abstract
In this paper we show how loglinear models can be used to cluster verbs based on their subcategorization preferences. We describe how information about the phrases or clauses a verb combines with can be learned computationally from an automatically tagged corpus of 9,333,555 words. We use loglinear modeling to describe the relation between the acquired counts of the part-of-speech tags that co-occur with the verbs at predetermined positions. Based on these results, an unsupervised clustering algorithm is proposed.
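As a rough illustration of the kind of analysis the abstract describes, the sketch below fits the simplest loglinear model (independence between verb and tag) to a small table of tag counts and uses the likelihood-ratio statistic G² to judge how similarly two verbs distribute over tags. This is not the authors' model or algorithm: the function name `independence_fit`, the verb labels, the tag columns, and all counts are hypothetical and chosen only to make the idea concrete.

```python
import math


def independence_fit(table):
    """Fit the independence loglinear model to a two-way count table.

    Under independence the maximum-likelihood expected count of a cell is
    (row total * column total) / grand total. Returns the expected counts,
    the likelihood-ratio statistic G^2, and its degrees of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    # Cells with an observed count of zero contribute nothing to G^2.
    g2 = 2.0 * sum(
        obs * math.log(obs / exp)
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
        if obs > 0
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, g2, dof


# Hypothetical counts (NOT from the paper): for each verb, how often each
# tag class was seen at one fixed position after the verb.
# Columns: noun-like tag, preposition, other.
counts = {
    "verb_a": [80, 10, 10],
    "verb_b": [75, 15, 10],
    "verb_c": [10, 70, 20],
}


def dissimilarity(verb_x, verb_y):
    """G^2 of the independence model on the 2 x k table of two verbs."""
    _, g2, _ = independence_fit([counts[verb_x], counts[verb_y]])
    return g2


if __name__ == "__main__":
    for pair in [("verb_a", "verb_b"), ("verb_a", "verb_c")]:
        print(pair, round(dissimilarity(*pair), 3))
```

A small G² means the independence model fits the two-verb table well, that is, the two verbs show similar tag preferences and are candidates for the same cluster. A large G² suggests different subcategorization behaviour. A clustering procedure could merge verbs pairwise on this basis, although the procedure actually proposed in the paper may differ.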
Work supported by JNICT Projects CORPUS (PLUS/C/LIN/805/93) and DIXIT (2/2.1/TIT/1670/95)
Work supported by PhD scholarship JNICT-PRAXIS XXI/BD/2909/94
© 1998 Springer-Verlag Berlin Heidelberg
Cite this paper
Miguel Marques, N., Pereira Lopes, G., Agra Coelho, C. (1998). Learning verbal transitivity using loglinear models. In: Nédellec, C., Rouveirol, C. (eds) Machine Learning: ECML-98. ECML 1998. Lecture Notes in Computer Science, vol 1398. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0026667
DOI: https://doi.org/10.1007/BFb0026667
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64417-0
Online ISBN: 978-3-540-69781-7