Corpus Annotation/Management Tools for the Project: Balanced Corpus of Contemporary Written Japanese

Matsumoto, Yuji

doi:10.1007/978-3-540-78159-2_11

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4938)

Included in the following conference series:

International Conference on Large-Scale Knowledge Resources

Abstract

This paper introduces our work on corpus annotation and the development of corpus management tools in the Japanese government-funded project Balanced Corpus of Contemporary Written Japanese. We are investigating several levels of text annotation, covering morphological analysis and POS tagging, syntactic dependency parsing, predicate-argument analysis, and coreference analysis. Since automatic annotation is not perfect, we need annotated corpus management tools that facilitate corpus browsing and error correction. In particular, we take up our corpus management tool ChaKi, explain its functions, and discuss how we are trying to maintain the consistency of corpus annotation.


Author information

Authors and Affiliations

Graduate School of Information Science, Nara Institute of Science and Technology, Takayama, Ikoma, Nara 630-0192, Japan
Yuji Matsumoto

Editor information

Takenobu Tokunaga, Antonio Ortega

About this paper

Cite this paper

Matsumoto, Y. (2008). Corpus Annotation/Management Tools for the Project: Balanced Corpus of Contemporary Written Japanese. In: Tokunaga, T., Ortega, A. (eds) Large-Scale Knowledge Resources. Construction and Application. LKR 2008. Lecture Notes in Computer Science, vol 4938. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78159-2_11

DOI: https://doi.org/10.1007/978-3-540-78159-2_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78158-5
Online ISBN: 978-3-540-78159-2
eBook Packages: Computer Science, Computer Science (R0)
