Automatic Partial Parsing Rule Acquisition Using Decision Tree Induction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3651)

Abstract

Partial parsing techniques try to recover syntactic information efficiently and reliably by sacrificing completeness and depth of analysis. One of the difficulties of partial parsing is finding a way to extract the underlying grammar automatically. In this paper, we present a method for automatically extracting partial parsing rules from a tree-annotated corpus using decision tree induction. We define partial parsing rules as those that can deterministically decide the structure of a substring in an input sentence. This decision can be treated as a classification problem: for a substring in an input sentence, the proper structure is chosen from among the structures that occur in the corpus. We use decision tree induction for the classification and induce partial parsing rules from the resulting decision tree. The acquired grammar is similar to a phrase structure grammar with contextual and lexical information, but it allows building structures of depth one or more. Our experiments showed that the proposed partial parser, using the automatically extracted rules, is not only accurate and efficient but also achieves reasonable coverage for Korean.
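The classification view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names (`left`, `t0`, `t1`, `right`), the toy part-of-speech instances, the `NONE` label for substrings left undecided, and the simplified ID3-style induction are all assumptions made for illustration only. Each training instance is a two-tag substring with one tag of context on either side, labelled with the structure it would receive in an annotated corpus; a tree is induced, and each root-to-leaf path that ends in a structure is read off as a rule.

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a list of class labels.
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def best_feature(rows, labels, features):
    # Pick the feature with the highest information gain.
    base = entropy(labels)
    def gain(f):
        remainder = 0.0
        for value in {r[f] for r in rows}:
            subset = [l for r, l in zip(rows, labels) if r[f] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        return base - remainder
    return max(features, key=gain)

def induce(rows, labels, features):
    # Leaf: a single structure label (majority label if features run out).
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]
    f = best_feature(rows, labels, features)
    rest = [g for g in features if g != f]
    branches = {}
    for value in sorted({r[f] for r in rows}):
        idx = [i for i, r in enumerate(rows) if r[f] == value]
        branches[value] = induce([rows[i] for i in idx], [labels[i] for i in idx], rest)
    return (f, branches)

def extract_rules(tree, conditions=()):
    # Each root-to-leaf path becomes (conditions, structure).
    if isinstance(tree, str):
        return [(conditions, tree)]
    f, branches = tree
    rules = []
    for value, sub in branches.items():
        rules.extend(extract_rules(sub, conditions + ((f, value),)))
    return rules

def classify(tree, row):
    # Follow the tree; unseen feature values fall back to "NONE" (assumption).
    while not isinstance(tree, str):
        f, branches = tree
        tree = branches.get(row[f], "NONE")
    return tree

# Hypothetical toy instances: a two-tag substring (t0, t1) plus its context.
FEATURES = ["left", "t0", "t1", "right"]
rows = [
    {"left": "BOS", "t0": "N",   "t1": "N", "right": "V"},
    {"left": "V",   "t0": "N",   "t1": "N", "right": "V"},
    {"left": "BOS", "t0": "N",   "t1": "N", "right": "N"},
    {"left": "BOS", "t0": "ADJ", "t1": "N", "right": "V"},
    {"left": "V",   "t0": "ADJ", "t1": "N", "right": "N"},
]
labels = ["(NP N N)", "(NP N N)", "NONE", "(NP ADJ N)", "NONE"]

tree = induce(rows, labels, FEATURES)
# Keep only paths that decide on a structure (an assumption of this sketch).
rules = [(c, s) for c, s in extract_rules(tree) if s != "NONE"]

for conditions, structure in rules:
    lhs = " and ".join(f"{f}={v}" for f, v in conditions)
    print(f"if {lhs} then {structure}")
```

In this toy setting the right-hand context is the most informative feature, so the induced rules condition on it before looking at the substring itself. This mirrors, in a very reduced form, the abstract's description of rules that carry contextual information.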

This research was supported in part by the Ministry of Science and Technology, the Ministry of Culture and Tourism, and the Korea Science and Engineering Foundation in Korea.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Choi, MS., Lim, C.S., Choi, KS. (2005). Automatic Partial Parsing Rule Acquisition Using Decision Tree Induction. In: Dale, R., Wong, KF., Su, J., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2005. IJCNLP 2005. Lecture Notes in Computer Science, vol 3651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562214_13

  • DOI: https://doi.org/10.1007/11562214_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29172-5

  • Online ISBN: 978-3-540-31724-1

  • eBook Packages: Computer Science, Computer Science (R0)