Acquisition of Syntactical Knowledge from Text

  • Conference paper
Information and Classification

Part of the book series: Studies in Classification, Data Analysis and Knowledge Organization ((STUDIES CLASS))

Abstract

This paper outlines a system designed to infer a grammar from a finite sample of linguistic data (a corpus). It is inspired by the research on inductive inference in the sense of Gold (1967). After the corpus has been tagged, an incremental learning algorithm produces a sequence of grammars that approximates the target grammar of the data provided. In each step, a small set of sentences is selected in a way that reduces the danger of overgeneralization. The selected sentences are analysed by a modified Earley parser, which makes it possible to measure the “distance” between the language generated by the current grammar G and the sentences not covered by G. The sentence that minimizes the “inductive leap” for the learner is chosen as the basis for the next grammar. For this sentence, several hypotheses for completing its partial structural description are formulated and evaluated, and the “best” hypothesis is then used to infer a new grammar. This process continues until the corpus is completely covered by the grammar.
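The selection loop described in the abstract can be illustrated with a deliberately simplified sketch. It is not the system from the paper: the "grammar" here is assumed to be a plain set of flat tag sequences, Levenshtein distance over tags stands in for the distance computed by the modified Earley parser, and the only "hypothesis" considered is adding the selected sentence as a new pattern. All names (`edit_distance`, `distance_to_grammar`, `learn`) and the toy tag set are hypothetical.

```python
from typing import List, Sequence, Set, Tuple

Tags = Tuple[str, ...]  # one tagged sentence, e.g. ("DET", "N", "V")


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two tag sequences (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # match or substitute
        prev = cur
    return prev[-1]


def distance_to_grammar(sentence: Tags, grammar: Set[Tags]) -> int:
    """Assumed stand-in for the parser-based distance: nearest known pattern."""
    return min(edit_distance(sentence, pattern) for pattern in grammar)


def learn(corpus: List[Tags], seed: Tags) -> List[Tags]:
    """Return the order in which uncovered sentences are learned.

    In each step the uncovered sentence closest to the current grammar
    (the smallest "inductive leap") is selected and added to the grammar.
    """
    grammar: Set[Tags] = {seed}
    order: List[Tags] = []
    uncovered = [s for s in corpus if s not in grammar]
    while uncovered:
        best = min(uncovered, key=lambda s: distance_to_grammar(s, grammar))
        grammar.add(best)  # toy "hypothesis": memorize the sentence as a pattern
        order.append(best)
        uncovered = [s for s in uncovered if s not in grammar]
    return order


if __name__ == "__main__":
    seed = ("DET", "N", "V")
    corpus = [
        ("DET", "ADJ", "N", "V", "DET", "ADJ", "N"),
        ("DET", "N", "V", "DET", "N"),
        ("DET", "ADJ", "N", "V"),
    ]
    for sentence in learn(corpus, seed):
        print(" ".join(sentence))
```

Because the sentence nearest to the current grammar is always taken first, longer sentences are deferred until intermediate patterns have made them reachable in small steps. A real implementation would replace the pattern set with a context-free grammar and the memorization step with the hypothesis evaluation the abstract describes.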

References

  • Aho, A. V. & T. G. Peterson (1972), A minimum distance error-correcting parser for context-free languages, SIAM Journal on Computing, 1(4), 305–12.

  • Angluin, D. (1980), Inductive inference of formal languages from positive data, Information and Control, 45, 117–35.

  • Berwick, R. C. (1986), Learning from positive-only examples, in: R. S. Michalski, J. G. Carbonell & T. M. Mitchell (eds), Machine Learning, Vol. II, Morgan Kaufmann, Los Altos, 625–45.

  • Crespi-Reghizzi, S. (1972), An effective model for grammar inference, in: B. Gilchrist (ed), Information Processing 71, Elsevier North-Holland, 524–29.

  • Garside, R., G. Leech & G. Sampson (1987), The computational analysis of English, Longman, New York.

  • Gold, E. M. (1967), Language identification in the limit, Information and Control, 10, 447–74.

  • Lyon, G. (1974), Syntax-directed least-errors analysis for context-free languages: A practical approach, Communications of the ACM, 17(1), 3–14.

  • Morgan, J.L. (1986), From simple input to complex grammar, The MIT Press, Cambridge, MA.

  • Wagner, R. A. & J. L. Seiferas (1978), Correcting counter-automaton-recognizable languages, SIAM Journal on Computing, 7(3), 357–75.

  • Yokomori, T. (1989), Learning context-free languages efficiently, in: K.P. Jantke (ed), Analogical and inductive inference, Springer, Berlin-Heidelberg, 104–23.


Copyright information

© 1993 Springer-Verlag Berlin · Heidelberg

About this paper

Cite this paper

Schrepp, J. (1993). Acquisition of Syntactical Knowledge from Text. In: Opitz, O., Lausen, B., Klar, R. (eds) Information and Classification. Studies in Classification, Data Analysis and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-50974-2_36

  • DOI: https://doi.org/10.1007/978-3-642-50974-2_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-56736-3

  • Online ISBN: 978-3-642-50974-2

  • eBook Packages: Springer Book Archive
