An Adversarial Joint Learning Model for Low-Resource Language Semantic Textual Similarity

  • Junfeng Tian
  • Man Lan (email author)
  • Yuanbin Wu
  • Jingang Wang
  • Long Qiu
  • Sheng Li
  • Lang Jun
  • Luo Si
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10772)


Semantic Textual Similarity (STS) for low-resource languages is a challenging research problem with practical applications. Traditional solutions use machine translation to translate the low-resource language into a resource-rich language such as English, so the final performance depends heavily on the quality of the machine translation. To remove this dependency on machine translation while still taking advantage of the data available in resource-rich languages, this work proposes to jointly learn the STS task of the low-resource language and that of a resource-rich one, relying only on multilingual word embeddings. In particular, we project the low-resource language word embeddings into the semantic space of the resource-rich language via a translation matrix. To make the projected word embeddings resemble those of the resource-rich language, a language discriminator is introduced as an adversarial teacher. The parameters of the sentence similarity neural networks for the two tasks can thus be shared effectively. The plausibility of our model is demonstrated by extensive experimental results.
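The projection and the adversarial discriminator described above can be illustrated with a small, dependency-free sketch. The abstract does not specify the architecture or the loss functions, so everything here is an assumption made for illustration: the function names (`project`, `discriminator`, `discriminator_loss`, `projector_loss`) are hypothetical, the discriminator is reduced to a single logistic unit, and the losses are plain binary cross-entropy terms. It is not the authors' implementation.

```python
import math

def project(W, x):
    """Map a low-resource word vector x into the resource-rich space as W x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

def discriminator(theta, bias, v):
    """Logistic unit (an assumption): P(v belongs to the resource-rich language)."""
    z = sum(t * vi for t, vi in zip(theta, v)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def discriminator_loss(theta, bias, rich_vecs, projected_vecs):
    """Binary cross-entropy with label 1 for resource-rich vectors
    and label 0 for projected low-resource vectors."""
    eps = 1e-12  # guards against log(0)
    loss = 0.0
    for v in rich_vecs:
        loss -= math.log(discriminator(theta, bias, v) + eps)
    for v in projected_vecs:
        loss -= math.log(1.0 - discriminator(theta, bias, v) + eps)
    return loss / (len(rich_vecs) + len(projected_vecs))

def projector_loss(theta, bias, projected_vecs):
    """Adversarial objective for the translation matrix (an assumption):
    it is low when the discriminator mistakes projected vectors
    for resource-rich ones."""
    eps = 1e-12
    total = sum(math.log(discriminator(theta, bias, v) + eps)
                for v in projected_vecs)
    return -total / len(projected_vecs)

# Toy data: 2-dimensional embeddings, chosen only for illustration.
W = [[1.0, 0.0], [0.0, 1.0]]            # translation matrix (identity here)
low_resource = [[0.5, -1.0], [2.0, 0.25]]
resource_rich = [[1.0, 1.0], [0.0, -0.5]]
theta, bias = [0.0, 0.0], 0.0           # untrained discriminator

projected = [project(W, x) for x in low_resource]
d_loss = discriminator_loss(theta, bias, resource_rich, projected)
p_loss = projector_loss(theta, bias, projected)
```

In a training loop under these assumptions, the discriminator parameters would be updated to lower `discriminator_loss` while the translation matrix would be updated to lower `projector_loss`, so the two pull in opposite directions. With the untrained discriminator above, every vector is scored 0.5 and both losses are approximately ln 2.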


Semantic Textual Similarity · Low-resource language · Neural networks · Adversarial learning



We would like to thank the reviewers for their valuable comments. This work is supported by grants from the Science and Technology Commission of Shanghai Municipality (15ZR1410700) and the Open Project of Shanghai Key Laboratory of Trustworthy Computing (No. 07dz22304201604).



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Junfeng Tian (1)
  • Man Lan (1, 2), email author
  • Yuanbin Wu (1, 2)
  • Jingang Wang (3)
  • Long Qiu (4)
  • Sheng Li (3)
  • Lang Jun (3)
  • Luo Si (3)

  1. School of Computer Science and Software Engineering, East China Normal University, Shanghai, People’s Republic of China
  2. Shanghai Key Laboratory of Multidimensional Information Processing, Shanghai, China
  3. iDST, Alibaba Group, Hangzhou, China
  4. Onehome (Beijing) Network Technology Co. Ltd., Beijing, China
