Deep Domain Adaptation for Low-Resource Cross-Lingual Text Classification Tasks

  • Guan-Yuan Chen
  • Von-Wun Soo
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1215)


Data-driven machine learning approaches have recently succeeded on many text classification tasks in resource-abundant languages. However, many languages still lack sufficient labeled data for the same tasks. High-quality parallel corpora may be costly to obtain, and automated machine translation cannot be relied on where translation tools for those low-resource languages are unreliable or unavailable. In this work, we propose an effective transfer learning method for scenarios where large-scale cross-lingual data is not available. It combines two transfer learning schemes, parameter sharing (parameter-based) and domain adaptation (feature-based), which are jointly trained on the high-resource and low-resource languages together. We conducted cross-lingual transfer learning experiments on sentiment, subjectivity, and question-type classification, from English to Chinese and from English to Vietnamese. The experiments show that the proposed approach significantly outperformed state-of-the-art models trained only on monolingual data on the corresponding benchmarks.
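The joint training idea can be illustrated with a minimal sketch. The abstract does not specify the architecture or the adaptation criterion, so everything below is an assumption made for illustration: a single shared tanh layer stands in for the shared parameters, a linear-kernel mean discrepancy between source and target features stands in for the feature-based adaptation term, and the names `encode`, `linear_mmd`, `joint_loss`, and the weight `lam` are hypothetical.

```python
import math

def mean_vector(rows):
    # Column-wise mean of a batch of feature vectors.
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def linear_mmd(src_feats, tgt_feats):
    # Assumed adaptation term: squared distance between batch feature means.
    mu_s, mu_t = mean_vector(src_feats), mean_vector(tgt_feats)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))

def encode(x, shared_w):
    # Shared layer (parameter sharing): the same weights encode both languages.
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in shared_w]

def classify(h, clf_w):
    # Linear classifier producing one logit per class.
    return [sum(w * v for w, v in zip(row, h)) for row in clf_w]

def cross_entropy(logits, label):
    # Numerically stable negative log-likelihood of the true class.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_z - logits[label]

def joint_loss(shared_w, clf_w, src, tgt, lam=1.0):
    # src / tgt: lists of (input_vector, label) pairs for the high- and
    # low-resource language; lam weights the (assumed) adaptation term.
    src_h = [encode(x, shared_w) for x, _ in src]
    tgt_h = [encode(x, shared_w) for x, _ in tgt]
    cls = 0.0
    for feats, batch in ((src_h, src), (tgt_h, tgt)):
        losses = [cross_entropy(classify(h, clf_w), y)
                  for h, (_, y) in zip(feats, batch)]
        cls += sum(losses) / len(batch)
    return cls + lam * linear_mmd(src_h, tgt_h)
```

Minimizing such a loss over both languages at once pushes the shared layer toward features that are useful for classification and similarly distributed across the two languages; setting `lam=0` reduces it to plain parameter sharing.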


Cross-lingual, Deep domain adaptation, Transfer learning



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. Institute of Information Systems and Applications, National Tsing Hua University, Hsinchu, Taiwan
  2. Telecommunication Laboratories, Chunghwa Telecom, Taoyuan, Taiwan
  3. Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
