
An Assessment of Substitute Words in the Context of Academic Writing Proposed by Pre-trained and Specific Word Embedding Models

  • Chooi Ling Goh
  • Yves Lepage

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1215)

Abstract

Researchers who are non-native speakers of English often face problems when composing scientific articles in this language, most often due to a lack of vocabulary or of knowledge of alternative ways of expression. In this paper, we propose using word embeddings to look for substitute words for academic writing in a specific domain. Word embeddings may contain not only semantically similar words but also other words with similar word vectors that could be better expressions. A word embedding model trained on a collection of academic articles in a specific domain may suggest similar expressions that comply with that writing style and are suited to that domain. Our experimental results show that a word embedding model trained on the NLP domain is able to propose possible substitutes for target words in a given context.
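The lookup described in the abstract can be sketched as a nearest-neighbour search over word vectors: rank all candidate words by cosine similarity to the target word and return the top-k as possible substitutes. The following is a minimal illustration of that idea; the four-dimensional toy vectors and the `substitutes` helper are hypothetical stand-ins for a model trained on a domain-specific corpus, not the authors' actual setup.

```python
import numpy as np

# Hypothetical toy embeddings; a real model would be trained on a
# collection of academic articles in the target domain.
vectors = {
    "propose":   np.array([0.9, 0.1, 0.0, 0.2]),
    "suggest":   np.array([0.8, 0.2, 0.1, 0.3]),
    "banana":    np.array([0.0, 0.9, 0.8, 0.1]),
    "recommend": np.array([0.7, 0.3, 0.0, 0.4]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def substitutes(target, k=2):
    """Rank all other words by cosine similarity to the target word."""
    t = vectors[target]
    ranked = sorted(
        ((w, cosine(t, v)) for w, v in vectors.items() if w != target),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]

# "suggest" and "recommend" rank above the unrelated "banana".
print(substitutes("propose"))
```

With a model trained on domain-specific text, the same query tends to surface not just synonyms but domain-conventional phrasings, which is the effect the paper evaluates.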

Keywords

Word embedding · Word similarity · Dictionary lookup · Synonym · Academic writing

Notes

Acknowledgment

This work was supported by JSPS KAKENHI Grant Number JP18K11446.


Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. The University of Kitakyushu, Kitakyushu, Fukuoka, Japan
  2. Waseda University, Kitakyushu, Fukuoka, Japan
