Attention-Based Chinese Word Embedding

  • Conference paper
Cloud Computing and Security (ICCCS 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11066)

Abstract

Recent studies have shown that the internal composition of a Chinese word provides rich semantic information for learning its representation. A Chinese word consists of one or more characters, each of which carries semantic information, and some characters have multiple meanings. Moreover, the characters that make up a word contribute to its meaning to different degrees. To account for this, this paper proposes a new attention-based model (ACWE) for learning Chinese word representations. In addition, the “HIT IR-Lab Tongyici Cilin (Extended Version)” thesaurus is used to compute the semantic similarity between Chinese characters and words, which reduces the impact of data sparseness and improves the quality of the learned representations. We evaluate ACWE on a word similarity task and an analogical reasoning task; the experimental results show that it outperforms the existing baseline model.
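
The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea that characters contribute unequally to a word's meaning. It is not the ACWE formulation. The dot-product scoring, the softmax normalisation, the equal averaging of word and character parts, and the names `softmax`, `dot`, and `compose_word` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def compose_word(word_vec, char_vecs):
    """Combine a word vector with an attention-weighted sum of its characters.

    Assumption: each character is scored by its dot product with the word
    vector. A thesaurus-based similarity could be substituted for this score.
    """
    weights = softmax([dot(word_vec, c) for c in char_vecs])
    dim = len(word_vec)
    # Weighted sum of the character vectors, one dimension at a time.
    char_part = [
        sum(w * c[i] for w, c in zip(weights, char_vecs)) for i in range(dim)
    ]
    # Assumption: the word and character parts are averaged with equal weight.
    combined = [0.5 * (word_vec[i] + char_part[i]) for i in range(dim)]
    return combined, weights


# Toy two-character word: the first character is aligned with the word
# vector, so it receives the larger attention weight.
combined, weights = compose_word([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

With uniform weights this reduces to a plain average of the character vectors; the attention weights are what let one character dominate the composition.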

Author information

Corresponding author

Correspondence to Wei Zhang.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Liang, Y., Zhang, W., Yang, K. (2018). Attention-Based Chinese Word Embedding. In: Sun, X., Pan, Z., Bertino, E. (eds) Cloud Computing and Security. ICCCS 2018. Lecture Notes in Computer Science, vol 11066. Springer, Cham. https://doi.org/10.1007/978-3-030-00015-8_24

  • DOI: https://doi.org/10.1007/978-3-030-00015-8_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00014-1

  • Online ISBN: 978-3-030-00015-8

  • eBook Packages: Computer Science, Computer Science (R0)
