A Method of Pre-computing Connectivity Relations for Japanese/Korean POS Tagging

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2004)

Abstract

This paper presents an efficient dictionary structure for Part-of-Speech (POS) tagging of Japanese/Korean, obtained by extending Aho and Corasick’s pattern matching machine. The proposed method is a simple and fast algorithm that finds all possible morphemes in an input sentence in a single pass, and it stores the grammatical connectivity relations of neighboring morphemes in the output functions. The proposed method can therefore reduce the cost of both the dictionary lookup and the connection check needed to find the most suitable word segmentation. Simulation results show that the proposed method was 21.8% faster (in CPU time) than the general approach using a trie structure. The number of candidates for connection checking was 27.4% lower than in the original morphological analysis.
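The idea described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It builds a standard Aho-Corasick machine over a toy dictionary and reports, in one left-to-right pass, only those morphemes that can grammatically follow a morpheme ending where they start. The names `build_automaton`, `find_morphemes`, the `"BOS"` tag, the tag set, and the `connects` table are all hypothetical. The paper stores connectivity relations in the output functions; since the abstract does not give that encoding, this sketch simply checks a set of allowed tag pairs at match time.

```python
from collections import deque

def build_automaton(entries):
    """Build an Aho-Corasick machine from (surface, pos) dictionary entries."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure function
    out = [[]]    # output function: morphemes ending at each state
    for surface, pos in entries:
        state = 0
        for ch in surface:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append((surface, pos))
    # Breadth-first pass: compute failure links and merge outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out

def find_morphemes(text, automaton, connects):
    """Scan text once; keep matches that connect to a kept left neighbor."""
    goto, fail, out = automaton
    # ends[i]: tags of accepted morphemes ending just before text[i].
    ends = [set() for _ in range(len(text) + 1)]
    ends[0].add("BOS")  # hypothetical beginning-of-sentence tag
    found = []
    state = 0
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for surface, pos in out[state]:
            start = i + 1 - len(surface)
            if any((left, pos) in connects for left in ends[start]):
                ends[i + 1].add(pos)
                found.append((start, surface, pos))
    return found

# Toy dictionary and connection table (assumptions for illustration only).
automaton = build_automaton(
    [("すもも", "noun"), ("もも", "noun"), ("も", "particle")]
)
connects = {("BOS", "noun"), ("noun", "particle"), ("particle", "noun")}
print(find_morphemes("すももも", automaton, connects))
# [(0, 'すもも', 'noun'), (3, 'も', 'particle')]
```

In this toy input the automaton matches six dictionary entries, but only two survive the connection check, which shows in miniature why filtering by connectivity during lookup shrinks the candidate set.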

References

  1. Abe, M., Ooshima, Y., Yuura, K., and Takeichi, N.: A Kana-Kanji Translation System for Non-Segmented Input Sentences Based on Syntactic and Semantic Analysis. Proceedings of the 10th International Conference on Computational Linguistics (1986) pp.280–285

  2. Aho, A.V., and Corasick, M.J.: Efficient String Matching: An Aid to Bibliographic Search. Communications of the ACM, Vol.18, No.6 (1975) pp.333–340

  3. Akiba, T., Tokunaga, T., and Tanaka, H.: An Extension of LangLAB for Japanese Morphological Analysis. Proceedings of the International Workshop on Sharable Natural Language Resources (1994) pp.36–42

  4. Aoe, J.: An Efficient Digital Search Algorithm by Using a Double-Array Structure. IEEE Transactions on Software Engineering, Vol.SE-15 (1989) pp.1066–1077

  5. Aoe, J.: Computer Algorithms: Key Search Strategies. IEEE Computer Society Press (1991)

  6. Aoe, J., Morimoto, K., Shishibori, M., and Park, K.H.: A Trie Compaction Algorithm for a Large Set of Keys. IEEE Transactions on Knowledge and Data Engineering, Vol.8, No.3 (1996) pp.476–491

  7. Kaplan, S.J.: Designing a Portable Natural Language Database Query System. ACM Transactions on Database Systems, Vol.9, No.1 (1984) pp.1–29

  8. Knuth, D.E., Morris, J.H., and Pratt, V.R.: Fast Pattern Matching in Strings. SIAM Journal on Computing, Vol.6, No.2 (1977) pp.323–350

  9. Kurohashi, S., Nakamura, T., Matsumoto, Y., and Nagao, M.: Improvements of Japanese Morphological Analyzer JUMAN. Proceedings of the International Workshop on Sharable Natural Language Resources (1994) pp.22–28

  10. Maruyama, H.: Backtracking-Free Dictionary Access Method for Japanese Morphological Analysis. Proceedings of the 15th International Conference on Computational Linguistics (1994) pp.208–213

  11. Mori, S.: High Speed Morphological Analysis using DFA. Technical report of IEICE of Japan, NLC96-23 (1996), pp.17–23 (in Japanese)

  12. Sano, H., Kawada, R., and Hasimoto, M.: Morphological Grammar Rules: An Implementation for JUMAN. Proceedings of the International Workshop on Sharable Natural Language Resources (1994) pp.29–35

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ando, K., Lee, Th., Shishibori, M., Aoe, Ji. (2001). A Method of Pre-computing Connectivity Relations for Japanese/Korean POS Tagging. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2001. Lecture Notes in Computer Science, vol 2004. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44686-9_36

  • DOI: https://doi.org/10.1007/3-540-44686-9_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41687-6

  • Online ISBN: 978-3-540-44686-6

  • eBook Packages: Springer Book Archive
