Abstract
This chapter first analyzes the general relation between linguistic analysis and computational method, using automatic word form recognition as a familiar example. This example exhibits a number of properties that are methodologically characteristic of all components of grammar. The chapter then presents methods for investigating the frequency distribution of words in natural language.
References
See O. Jespersen 1921, p. 341–346.
See K. Hess, J. Brustkern and W. Lenders 1983.
Cf. H. Bergenholtz 1989, D. Biber 1994, N. Oostdijk and P. de Haan (eds.) 1994.
The consequences of the tagset choice on the results of the corpus analysis are mentioned in S. Greenbaum and N. Yibin 1994, p. 34.
The use of HMMs for the grammatical tagging of corpora is described in, e.g., G. Leech, R. Garside and E. Atwell 1983, I. Marshall 1983, S. DeRose 1988, R. Sharman 1990, P. Brown, V. Della Pietra, et al. 1991. See also K. Church and L.R. Mercer 1993.
Meanwhile, the tagged BNC-lists have been removed from the web.
Unfortunately, neither G. Leech 1995 nor L. Burnard 1995 specifies what exactly constitutes an error in tagging the BNC. However, a new project to improve the tagger was started in June 1995. It is called ‘The British National Corpus Tag Enhancement Project’, and its results were originally scheduled to be made available in September 1996.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Hausser, R. (2001). Corpus analysis. In: Foundations of Computational Linguistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-04337-0_16
DOI: https://doi.org/10.1007/978-3-662-04337-0_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-07626-8
Online ISBN: 978-3-662-04337-0
eBook Packages: Springer Book Archive