Abstract
Word vectors and topic models can help retrieve information semantically to some extent, but several problems remain. (1) Antonyms receive high similarity scores when clustered with word vectors. (2) The space of named entities, such as person, location, and organization names, is unbounded, while the number of occurrences of any specific named entity in a corpus is limited; as a result, the vectors for these named entities are not fully trained. To overcome these problems, this paper proposes a word vector computation model based on implicit expression. Words with the same meaning are implicitly expressed based on a dictionary and part of speech. With implicit expression, the sparsity of the corpus is reduced and word vectors are trained more thoroughly.
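The abstract does not give the authors' implementation, but the preprocessing idea it describes can be illustrated with a minimal sketch: named entities are collapsed into shared class tokens (so rare names contribute to one well-trained vector) and synonym variants are mapped to a canonical dictionary headword before word-vector training. The mapping tables, tokens, and function names below are hypothetical illustrations, not the paper's code.

# Minimal sketch of "implicit expression" preprocessing, assuming a
# synonym dictionary and a named-entity lookup are available upstream.
# All data here is hypothetical example content.

# Hypothetical synonym dictionary: variant -> canonical headword.
SYNONYMS = {
    "begin": "start",
    "commence": "start",
    "large": "big",
}

# Hypothetical named-entity lookup: token -> entity class.
ENTITY_CLASS = {
    "Alice": "PER",
    "Beijing": "LOC",
    "Acme": "ORG",
}

def implicit_express(tokens):
    """Replace each token by its implicit expression:
    an entity-class placeholder if it is a named entity,
    else its canonical synonym, else the token itself."""
    out = []
    for tok in tokens:
        if tok in ENTITY_CLASS:
            out.append("<" + ENTITY_CLASS[tok] + ">")  # e.g. Alice -> <PER>
        else:
            out.append(SYNONYMS.get(tok, tok))         # e.g. commence -> start
    return out

if __name__ == "__main__":
    sentence = ["Alice", "will", "commence", "work", "at", "Acme", "in", "Beijing"]
    print(implicit_express(sentence))
    # ['<PER>', 'will', 'start', 'work', 'at', '<ORG>', 'in', '<LOC>']

Sentences transformed this way share vocabulary entries that raw text would split across many rare surface forms, which is the sparsity reduction the abstract refers to; the transformed corpus would then be fed to an ordinary word-vector trainer.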
Acknowledgement
This work is supported by the National Natural Science Foundation of China (Grant Nos. 91646201 and 91224008) and by the National Basic Research Program of China (973 Program, No. 2012CB719705).
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Wang, X., Zhang, H. (2018). Word Vector Computation Based on Implicit Expression. In: Abawajy, J., Choo, K.-K., Islam, R. (eds) International Conference on Applications and Techniques in Cyber Security and Intelligence. ATCI 2017. Advances in Intelligent Systems and Computing, vol 580. Springer, Cham. https://doi.org/10.1007/978-3-319-67071-3_12
DOI: https://doi.org/10.1007/978-3-319-67071-3_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67070-6
Online ISBN: 978-3-319-67071-3
eBook Packages: Engineering (R0)