Resolving Coordinate Structures for Chinese Constituent Parsing

  • Conference paper
Natural Language Processing and Chinese Computing (NLPCC 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9362)

Abstract

Coordinate structures are linguistic structures consisting of two or more conjuncts, which usually combine into a larger constituent that behaves as a single unit. However, the boundaries of the individual conjuncts are difficult to identify, which makes it hard to parse both the coordinate structure itself and the larger structures that contain it. In labeled data such as the Penn Chinese Treebank (CTB), coordinate structures are not annotated explicitly, which complicates the problem further. In this paper, we treat the resolution of coordinate structures as an independent sub-problem of parsing. We first define coordinate structures explicitly and design rules to extract them from the labeled CTB data. We then propose a specifically designed grammar for parsing coordinate structures automatically, together with two groups of new features that better model coordinate structures in a shift-reduce parsing framework. Our approach achieves a 15% improvement in F1 score on resolving coordinate structures.
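
To make the idea of rule-based extraction concrete, the following is a minimal sketch of how candidate coordinate structures might be pulled out of a CTB-style bracketed tree. The abstract does not state the paper's actual rules, so the heuristic here is an assumption for illustration only: a constituent counts as a coordination if one of its children is a coordinating conjunction (tag CC) or the enumeration comma "、" (tag PU), and the remaining children are taken as conjuncts. The function names `parse_bracketed`, `words`, `is_conjunction`, and `find_coordinations` are hypothetical.

```python
from typing import List, Tuple

Tree = Tuple[str, list]  # (label, children); a leaf child is a plain str


def parse_bracketed(text: str) -> Tree:
    """Parse a bracketed tree such as "(NP (NN a) (CC b) (NN c))"."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read() -> Tree:
        nonlocal pos
        assert tokens[pos] == "("
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])  # a word under a POS tag
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    return read()


def words(tree) -> List[str]:
    """Collect the words covered by a subtree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1] for w in words(child)]


def is_conjunction(node) -> bool:
    """Assumed heuristic: a CC tag or the enumeration comma marks coordination."""
    if isinstance(node, str):
        return False
    label, children = node
    return label == "CC" or (label == "PU" and children == ["、"])


def find_coordinations(tree) -> list:
    """Return (parent_label, conjunct word lists) for each candidate found."""
    found = []
    if isinstance(tree, str):
        return found
    label, children = tree
    if any(is_conjunction(c) for c in children):
        conjuncts = [words(c) for c in children if not is_conjunction(c)]
        if len(conjuncts) >= 2:
            found.append((label, conjuncts))
    for child in children:
        found.extend(find_coordinations(child))
    return found


example = "(IP (NP (NN 苹果) (PU 、) (NN 香蕉) (CC 和) (NN 橘子)) (VP (VA 好吃)))"
coordinations = find_coordinations(parse_bracketed(example))
```

On the toy sentence above, the sketch reports a single NP coordination with three one-word conjuncts. A heuristic this simple treats every non-conjunction sibling as a conjunct, so it would mislabel modifiers that are attached at the same level as the conjuncts; identifying conjunct boundaries is exactly the difficulty the abstract describes.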

Corresponding author

Correspondence to Shujian Huang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhou, Y., Huang, S., Dai, X., Chen, J. (2015). Resolving Coordinate Structures for Chinese Constituent Parsing. In: Li, J., Ji, H., Zhao, D., Feng, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2015. Lecture Notes in Computer Science, vol 9362. Springer, Cham. https://doi.org/10.1007/978-3-319-25207-0_30

  • DOI: https://doi.org/10.1007/978-3-319-25207-0_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25206-3

  • Online ISBN: 978-3-319-25207-0

  • eBook Packages: Computer Science, Computer Science (R0)
