Disentangling the Latent Space of (Variational) Autoencoders for NLP

  • Gino Brunner
  • Yuyi Wang
  • Roger Wattenhofer
  • Michael Weigelt
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 840)


We train multi-task (variational) autoencoders on linguistic tasks and analyze the learned hidden sentence representations. The representations change significantly when translation and part-of-speech decoders are added. The more decoders are attached, the better the models cluster sentences according to their syntactic similarity, as the representation space becomes less entangled. We compare standard unconstrained autoencoders to variational autoencoders and find significant differences. We achieve better disentanglement with the standard autoencoder, which goes against recent work on variational autoencoders in the visual domain.
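The multi-task objective described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it combines a reconstruction loss with the losses of any attached task decoders (e.g. translation, part-of-speech tagging) and, for the variational case, the standard diagonal-Gaussian KL term from Kingma and Welling's auto-encoding variational Bayes. The function names and the weighting scheme are assumptions for the sketch.

```python
import math

def kl_divergence(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, 1)) for a diagonal Gaussian latent,
    summed over latent dimensions (standard VAE regularizer):
    -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)."""
    return -0.5 * sum(1 + lv - m ** 2 - math.exp(lv)
                      for m, lv in zip(mu, logvar))

def multitask_loss(recon_loss, task_losses, kl=0.0, beta=1.0):
    """Total objective: reconstruction loss plus the losses of all
    attached task decoders, plus a beta-weighted KL term.
    Setting beta=0 recovers a standard (unconstrained) autoencoder;
    beta=1 gives the usual VAE objective."""
    return recon_loss + sum(task_losses) + beta * kl
```

For example, a latent code at the prior (mu = 0, log-variance = 0) contributes zero KL penalty, so the variational objective then reduces to the plain multi-task autoencoder loss.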


Keywords: NLP · Variational Autoencoder · Disentanglement · Representation learning · Syntax



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Gino Brunner¹
  • Yuyi Wang¹
  • Roger Wattenhofer¹
  • Michael Weigelt¹
  1. ETH Zurich, Zürich, Switzerland
