AugGAN: Cross Domain Adaptation with GAN-Based Data Augmentation

  • Sheng-Wei Huang
  • Che-Tsung Lin
  • Shu-Ping Chen
  • Yen-Yi Wu
  • Po-Hao Hsu
  • Shang-Hong Lai
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11213)


Deep learning based image-to-image translation methods aim to learn the joint distribution of two domains and find transformations between them. Although recent GAN (Generative Adversarial Network) based methods have shown compelling results, they are prone to failing to preserve image-objects and maintain translation consistency, which reduces their practicality for tasks such as generating large-scale training data for different domains. To address this problem, we propose a structure-aware image-to-image translation network, which is composed of encoders, generators, discriminators and parsing nets for each of the two domains in a unified framework. The proposed network generates more visually plausible images than competing methods on different image-translation tasks. In addition, we quantitatively evaluate different methods by training Faster-RCNN and YOLO with datasets generated from the image-translation results and demonstrate significant improvement in detection accuracy when using the proposed image-object preserving network.
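
The abstract does not state the training objective, so the following is only a minimal sketch of one way a parsing-net term could be combined with an adversarial term to make a generator structure-aware. The function names, the per-pixel cross-entropy form, and the weight seg_weight are all assumptions made for illustration, not the authors' formulation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def parsing_loss(pixel_logits, pixel_labels):
    """Mean per-pixel cross-entropy between class scores and integer labels.

    Hypothetical stand-in for the supervision a parsing net could receive.
    """
    losses = [
        -math.log(softmax(logits)[label])
        for logits, label in zip(pixel_logits, pixel_labels)
    ]
    return sum(losses) / len(losses)


def generator_objective(adv_loss, seg_loss, seg_weight=1.0):
    """Assumed combined objective: adversarial term plus weighted parsing term."""
    return adv_loss + seg_weight * seg_loss


# Toy example: three "pixels", two classes (e.g. background and vehicle).
toy_logits = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
toy_labels = [0, 1, 1]
seg = parsing_loss(toy_logits, toy_labels)
total = generator_objective(adv_loss=0.5, seg_loss=seg, seg_weight=10.0)
print(f"parsing loss = {seg:.4f}, combined objective = {total:.4f}")
```

In this sketch, a larger seg_weight penalizes translations whose features no longer support a correct segmentation, which is one plausible reading of "image-object preserving".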


Generative adversarial network · Image-to-image translation · Semantic segmentation · Object detection · Domain adaptation




Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
  2. Intelligent Mobility Division, Mechanical and Mechatronics Systems Research Laboratories, Industrial Technology Research Institute, Zhudong, Taiwan
