Abstract
Vector-based word representation models are typically developed from very large corpora in the hope that the representations are reliable and have wide coverage, i.e., that they cover, ideally, all words. However, in real-world applications we often encounter words that are missing from the particular model at hand. In this paper, we present a novel Neural Network (NN) based approach for obtaining representations for words that are missing in a target model from another model, called the source model, in which representations for these words are available, effectively pooling together the two models' vocabularies and the corresponding representations. Our experiments with three different types of pre-trained models (Word2vec, GloVe, and LSA) show that the representations obtained using our transformation approach can substantially and effectively extend the word coverage of existing models. The increase in the number of unique words covered by a model varies from a few times to several times, depending on which model's vocabulary is taken as the reference. The transformed word representations are also well correlated with the native target model representations (average correlation of up to 0.801 for words in the SimLex-999 dataset), indicating that the transformed vectors can effectively be used as substitutes for native word representations. Furthermore, an extrinsic evaluation on a word-to-word similarity task using the SimLex-999 dataset yields results close to those obtained with native target model representations.
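To make the transformation concrete, the sketch below learns a feed-forward mapping from the source model's vector space to the target model's space using the vocabulary the two models share, and then applies it to words available only in the source model. It is a minimal illustration under assumed choices (hidden size, tanh activation, MSE loss, Adam optimizer), not the authors' exact architecture or training procedure; note 5 below mentions that reducing the hidden layer to 400-500 units costs only a small drop in performance, and the references include Møller's scaled conjugate gradient, a different training algorithm.

```python
# Minimal sketch of the cross-model mapping idea, NOT the authors'
# implementation: the single tanh hidden layer, MSE loss, Adam optimizer,
# and hidden size of 500 are illustrative assumptions.
import torch
import torch.nn as nn

def train_mapping(src_vecs, tgt_vecs, hidden=500, epochs=200, lr=1e-3):
    """Fit f: source space -> target space on aligned rows, i.e. the
    vectors of words shared by both models."""
    d_src, d_tgt = src_vecs.shape[1], tgt_vecs.shape[1]
    net = nn.Sequential(nn.Linear(d_src, hidden),
                        nn.Tanh(),
                        nn.Linear(hidden, d_tgt))
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(net(src_vecs), tgt_vecs)
        loss.backward()
        opt.step()
    return net

# Toy usage with random stand-ins for real embeddings:
shared_src = torch.randn(5000, 300)    # source vectors of shared words
shared_tgt = torch.randn(5000, 200)    # target vectors of the same words
f = train_mapping(shared_src, shared_tgt)
only_in_source = torch.randn(10, 300)  # words the target model lacks
pooled = f(only_in_source)             # substitute target-space vectors
```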
Notes
- 1.
- 2.
- 3. Wiki_NVAR_f7 at http://semanticsimilarity.org/.
- 4. We have used 'token' and 'word' interchangeably.
- 5. In order to reduce the complexity of the model (and thus the risk of overfitting), the number of hidden units could be set to 500 or 400 with only a small reduction in performance.
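The extrinsic evaluation mentioned in the abstract can be outlined similarly: score each SimLex-999 word pair by cosine similarity and compare the resulting scores with human judgments using Spearman correlation, once with native target vectors and once with transformed ones. The sketch below is a hedged outline, not the paper's evaluation script; the pair list and vector dictionaries are hypothetical placeholders.

```python
# Hedged sketch of the word-to-word similarity evaluation described in the
# abstract. All names here (simlex_pairs, native_vecs, transformed_vecs)
# are hypothetical placeholders, not the paper's actual code or data.
import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    # Cosine similarity between two dense word vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def score_pairs(pairs, vectors):
    """pairs: [(word1, word2, human_score), ...]; vectors: dict word -> array.
    Returns system similarities and gold scores for pairs both words cover."""
    system, gold = [], []
    for w1, w2, human in pairs:
        if w1 in vectors and w2 in vectors:
            system.append(cosine(vectors[w1], vectors[w2]))
            gold.append(human)
    return system, gold

# Compare rankings against human judgments (Spearman's rho), once with
# native target vectors and once with vectors transformed from the source:
# rho_native, _ = spearmanr(*score_pairs(simlex_pairs, native_vecs))
# rho_transformed, _ = spearmanr(*score_pairs(simlex_pairs, transformed_vecs))
```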
References
Alexandrescu, A., Kirchhoff, K.: Factored neural language models. In: Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers, pp. 1–4. Association for Computational Linguistics (2006)
Banjade, R., Maharjan, N., Gautam, D., Rus, V.: Handling missing words by mapping across word vector representations. In: FLAIRS Conference, pp. 250–253 (2016)
Banjade, R., et al.: NeRoSim: a system for measuring and interpreting semantic textual similarity. In: Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015, co-located with NAACL), pp. 164–171 (2015)
Baroni, M., Zamparelli, R.: Nouns are vectors, adjectives are matrices: representing adjective-noun constructions in semantic space. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pp. 1183–1193. Association for Computational Linguistics (2010)
Batchkarov, M., Kober, T., Reffin, J., Weeds, J., Weir, D.: A critique of word similarity as a method for evaluating distributional semantic models (2016)
Bengio, Y., Ducharme, R., Vincent, P., Janvin, C.: A neural probabilistic language model. J. Mach. Learn. Res. 3, 1137–1155 (2003)
Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., Kuksa, P.: Natural language processing (almost) from scratch. J. Mach. Learn. Res. 12, 2493–2537 (2011)
Finkelstein, L., et al.: Placing search in context: the concept revisited. In: Proceedings of the 10th International Conference on World Wide Web, pp. 406–414. ACM (2001)
Hill, F., Reichart, R., Korhonen, A.: SimLex-999: evaluating semantic models with (genuine) similarity estimation. arXiv preprint arXiv:1408.3456 (2014)
Iacobacci, I., Pilehvar, M.T., Navigli, R.: SensEmbed: learning sense embeddings for word and relational similarity. In: Proceedings of ACL, pp. 95–105 (2015)
Landauer, T.K., Foltz, P.W., Laham, D.: An introduction to latent semantic analysis. Discourse Process. 25(2–3), 259–284 (1998)
Lei, T., Xin, Y., Zhang, Y., Barzilay, R., Jaakkola, T.: Low-rank tensors for scoring dependency structures. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, vol. 1, pp. 1381–1391 (2014)
Luong, M.T., Socher, R., Manning, C.D.: Better word representations with recursive neural networks for morphology. In: CoNLL-2013, p. 104 (2013)
Manning, C.D., Raghavan, P., Schütze, H.: Introduction to Information Retrieval, vol. 1. Cambridge University Press, New York (2008)
Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)
Miller, G.A.: WordNet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)
Miller, W.R.: Motivational interviewing with problem drinkers. Behav. Psychother. 11, 147–172 (1983)
Møller, M.F.: A scaled conjugate gradient algorithm for fast supervised learning. Neural Netw. 6(4), 525–533 (1993)
Nayak, N., Angeli, G., Manning, C.D.: Evaluating word embeddings using a representative suite of practical tasks. ACL 2016, 19 (2016)
Pennington, J., Socher, R., Manning, C.D.: GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), pp. 1532–1543 (2014)
Rus, V., Lintean, M.C., Banjade, R., Niraula, N.B., Stefanescu, D.: SEMILAR: the semantic similarity toolkit. In: ACL (Conference System Demonstrations), pp. 163–168. Association for Computational Linguistics (2013)
Sobell, L.C., Sobell, M.B.: Motivational interviewing strategies and techniques: rationales and examples (2008). http://www.nova.edu/gsc/forms/mi_rationale_techniques.pdf
Socher, R., et al.: Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1631–1642 (2013)
Stefanescu, D., Banjade, R., Rus, V.: Latent semantic analysis models on Wikipedia and TASA. In: Proceedings of the Language Resources and Evaluation Conference, pp. 1417–1422 (2014)
Turian, J., Ratinov, L., Bengio, Y.: Word representations: a simple and general method for semi-supervised learning. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 384–394. Association for Computational Linguistics (2010)
Yu, M., Dredze, M.: Improving lexical embeddings with semantic knowledge. In: Association for Computational Linguistics (ACL), pp. 545–550 (2014)
Acknowledgments
The research was supported by the Office of Naval Research (N00014-00-1-0600, N00014-15-P-1184, N00014-12-C-0643, N00014-16-C-3027) and the National Science Foundation Data Infrastructure Building Blocks program (ACI-1443068). Any opinions, findings, and conclusions expressed are solely the authors'.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Banjade, R., Maharjan, N., Gautam, D., Andrasik, F., Graesser, A.C., Rus, V. (2018). Pooling Word Vector Representations Across Models. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2017. Lecture Notes in Computer Science, vol. 10761. Springer, Cham. https://doi.org/10.1007/978-3-319-77113-7_2
DOI: https://doi.org/10.1007/978-3-319-77113-7_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77112-0
Online ISBN: 978-3-319-77113-7
eBook Packages: Computer Science; Computer Science (R0)