Abstract
As we saw in the previous chapter, rules can be used to select an appropriate tag for each token. This chapter continues to investigate rule-based tagging. However, whereas the rules in the previous chapter were written manually, based on someone’s linguistic knowledge and familiarity with the properties of the corpus, here we explore the possibility of learning tagging rules automatically. A potential advantage of automatic rule learning is that such a system could, in theory, be highly portable, both across domains and across languages: if training material is available, the system can be retrained with little or no human intervention. A limitation of this approach is that such a system can only learn facts that can be expressed in the learner's prespecified descriptive language, which restricts the types of rules that can be learned. For example, a person might discover that a word tends to receive one particular tag when it occurs toward the end of a sentence. If the learner has no access to the concepts of sentence length and position within a sentence, discovering such a heuristic rule is beyond the capability of the learning algorithm. One thing that differentiates this approach from other machine learning approaches, such as training neural networks (cf. Chapter 17) or hidden Markov models (HMMs; cf. Chapter 16), is that the learned information is in a form that people can understand, edit, and improve, just as with manually written rules.
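To illustrate how the learner's descriptive language bounds what can be learned, the following is a minimal sketch and not the method developed in the chapter. It assumes a toy greedy learner with a single hypothetical rule template, "change tag A to tag B when the previous tag is C"; the function names, the tag set, and the example data are all illustrative assumptions.

```python
# Toy sketch of learning readable tagging rules from a tagged corpus.
# Assumption: the learner's whole descriptive language is ONE template,
#   "change tag FRM to TO when the previous tag is PREV".
# Sentence position is not part of this language, so a rule such as
# "near the end of a sentence, prefer tag X" can never be learned here.

def apply_rule(tags, rule):
    """Return a copy of tags with the rule applied at every matching site."""
    frm, to, prev = rule
    out = list(tags)
    for i in range(1, len(tags)):
        # Conditions are tested against the tags as they were before this pass.
        if tags[i] == frm and tags[i - 1] == prev:
            out[i] = to
    return out

def errors(tags, gold):
    """Count positions where the current tags disagree with the gold tags."""
    return sum(t != g for t, g in zip(tags, gold))

def learn_rules(initial, gold, max_rules=5):
    """Greedily pick the rule that removes the most errors, until none helps."""
    current = list(initial)
    learned = []
    for _ in range(max_rules):
        # Each remaining error proposes one candidate rule from the template.
        candidates = {
            (current[i], gold[i], current[i - 1])
            for i in range(1, len(current))
            if current[i] != gold[i]
        }
        best, best_gain = None, 0
        for rule in sorted(candidates):
            gain = errors(current, gold) - errors(apply_rule(current, rule), gold)
            if gain > best_gain:
                best, best_gain = rule, gain
        if best is None:
            break
        current = apply_rule(current, best)
        learned.append(best)
    return learned, current

def describe(rule):
    """Render a learned rule in a form a person can read and edit."""
    frm, to, prev = rule
    return f"Change {frm} to {to} when the previous tag is {prev}"

# Illustrative data (two short sentences treated as one flat sequence).
tokens = ["the", "can", "is", "full", ".", "we", "can", "go", "."]
gold = ["DET", "NOUN", "VERB", "ADJ", ".", "PRON", "AUX", "VERB", "."]
# Hypothetical initial tagging that always tags "can" as AUX.
initial = ["DET", "AUX", "VERB", "ADJ", ".", "PRON", "AUX", "VERB", "."]

learned, tagged = learn_rules(initial, gold)
for rule in learned:
    print(describe(rule))
```

Because each learned rule is a plain (from, to, previous-tag) triple, it can be inspected and edited by hand. Extending what the learner can discover would mean adding templates, for instance one that conditions on position within the sentence.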
Copyright information
© 1999 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Brill, E. (1999). Corpus-Based Rules. In: van Halteren, H. (eds) Syntactic Wordclass Tagging. Text, Speech and Language Technology, vol 9. Springer, Dordrecht. https://doi.org/10.1007/978-94-015-9273-4_15
DOI: https://doi.org/10.1007/978-94-015-9273-4_15
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-5296-4
Online ISBN: 978-94-015-9273-4
eBook Packages: Springer Book Archive