Matching Pattern Acquisition Approach for Ancient Chinese Treebank Construction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10709)

Abstract

A Matching Pattern (MP) is a sequence of words or part-of-speech (POS) tags sampled from clauses, and MP acquisition is an effective approach to ancient Chinese treebank construction. The approach exploits two typical characteristics of ancient Chinese, short clauses and strong patterns, and divides the syntactic annotation process of treebank construction into three stages: (1) obtaining weighted MPs, each with a syntactic skeleton; (2) applying these MPs to match clauses; and (3) generating the syntactic structure of each matched clause from the syntactic skeleton of its MP. In our experiments, the syntactic skeletons are constructed based on the Sentence-based Grammar. The MP-based parsing procedures are implemented on both clause and fragment units. Experiments on corpora extracted from Yili and Zuozhuan show that an integrated algorithm involving both clause and fragment units achieves coverage/precision of 99.07%/82.76% and 97.25%/77.77%, respectively.
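The three stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that an MP is a plain POS sequence, that its weight is a raw frequency count, and that a skeleton is one role label per token. The function names (`acquire_patterns`, `parse_clause`, `coverage`), the tags `n`/`v`, and the roles `subj`/`pred` are all hypothetical. Fragment-level matching and the Sentence-based Grammar structures are not modeled.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence, Tuple

Clause = List[Tuple[str, str]]   # (word, POS) pairs; the tagset is an assumption
Pattern = Tuple[str, ...]        # POS sequence acting as the matching pattern
Skeleton = Tuple[str, ...]       # one role label per position (a simplification)


def acquire_patterns(
    annotated: Sequence[Tuple[Clause, Sequence[str]]]
) -> Dict[Pattern, Tuple[Skeleton, int]]:
    """Stage 1: collect weighted MPs from clauses that already carry a skeleton.

    The weight is assumed to be a frequency count. For each POS sequence,
    only the most frequent skeleton is kept.
    """
    counts: Counter = Counter()
    for clause, skeleton in annotated:
        pattern = tuple(pos for _, pos in clause)
        counts[(pattern, tuple(skeleton))] += 1

    best: Dict[Pattern, Tuple[Skeleton, int]] = {}
    for (pattern, skeleton), weight in counts.items():
        if pattern not in best or weight > best[pattern][1]:
            best[pattern] = (skeleton, weight)
    return best


def parse_clause(
    clause: Clause, patterns: Dict[Pattern, Tuple[Skeleton, int]]
) -> Optional[List[Tuple[str, str]]]:
    """Stages 2 and 3: match the clause against the MPs, then project the skeleton.

    Returns (role, word) pairs, or None when no MP matches the clause.
    """
    pattern = tuple(pos for _, pos in clause)
    hit = patterns.get(pattern)
    if hit is None:
        return None
    skeleton, _weight = hit
    return [(role, word) for role, (word, _pos) in zip(skeleton, clause)]


def coverage(
    clauses: Sequence[Clause], patterns: Dict[Pattern, Tuple[Skeleton, int]]
) -> float:
    """Fraction of clauses for which some MP matches."""
    if not clauses:
        return 0.0
    matched = sum(parse_clause(c, patterns) is not None for c in clauses)
    return matched / len(clauses)


if __name__ == "__main__":
    # Toy training data; the annotations are illustrative only.
    training = [
        ([("子", "n"), ("曰", "v")], ["subj", "pred"]),
        ([("王", "n"), ("曰", "v")], ["subj", "pred"]),
    ]
    mps = acquire_patterns(training)
    print(parse_clause([("公", "n"), ("至", "v")], mps))
    print(coverage([[("公", "n"), ("至", "v")], [("至", "v")]], mps))
```

Exact whole-clause matching keeps the sketch short. Precision, the second figure reported in the abstract, would additionally require gold structures to compare the generated output against.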



Author information

Corresponding author

Correspondence to Weiming Peng.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

He, J., Song, T., Peng, W., Song, J. (2018). Matching Pattern Acquisition Approach for Ancient Chinese Treebank Construction. In: Wu, Y., Hong, JF., Su, Q. (eds) Chinese Lexical Semantics. CLSW 2017. Lecture Notes in Computer Science, vol 10709. Springer, Cham. https://doi.org/10.1007/978-3-319-73573-3_44

  • DOI: https://doi.org/10.1007/978-3-319-73573-3_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73572-6

  • Online ISBN: 978-3-319-73573-3

  • eBook Packages: Computer Science, Computer Science (R0)
