Corpus Annotation/Management Tools for the Project: Balanced Corpus of Contemporary Written Japanese

  • Conference paper
Large-Scale Knowledge Resources. Construction and Application (LKR 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4938)

Abstract

This paper introduces our work on corpus annotation and management tool development in the Japanese government-funded project Balanced Corpus of Contemporary Written Japanese. We are investigating several levels of text annotation: morphological analysis and POS tagging, syntactic dependency parsing, predicate-argument analysis, and coreference analysis. Since automatic annotation is not perfect, we need annotated corpus management tools that facilitate corpus browsing and error correction. We focus in particular on our corpus management tool ChaKi, explain its functions, and discuss how we are trying to maintain the consistency of corpus annotation.

Editor information

Takenobu Tokunaga Antonio Ortega

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Matsumoto, Y. (2008). Corpus Annotation/Management Tools for the Project: Balanced Corpus of Contemporary Written Japanese. In: Tokunaga, T., Ortega, A. (eds) Large-Scale Knowledge Resources. Construction and Application. LKR 2008. Lecture Notes in Computer Science, vol 4938. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78159-2_11

  • DOI: https://doi.org/10.1007/978-3-540-78159-2_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78158-5

  • Online ISBN: 978-3-540-78159-2

  • eBook Packages: Computer Science, Computer Science (R0)
