CRAFT: Complementary Recommendation by Adversarial Feature Transform

  • Cong Phuoc Huynh
  • Arridhana Ciptadi
  • Ambrish Tyagi
  • Amit Agrawal
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11131)


We propose a framework that harnesses visual cues in an unsupervised manner to learn the co-occurrence distribution of items in real-world images for complementary recommendation. Our model learns a non-linear transformation between the two manifolds of source and target item categories (e.g., tops and bottoms in outfits). Given a large dataset of images containing instances of co-occurring items, we train a generative transformer network directly on the feature representation by casting it as an adversarial optimization problem. Such a conditional generative model can produce multiple novel samples of complementary items (in the feature space) for a given query item. We demonstrate our framework on the task of recommending complementary top apparel for a given bottom clothing item. The recommendations made by our system are diverse and are favored by human experts over the baseline approaches.
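To make the inference-time idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a conditional generator that maps a query feature vector plus a random noise vector to a feature vector in the target category, so that different noise draws yield different candidates. The generator here is a hypothetical stand-in with random (untrained) weights; in the framework described above it would be trained adversarially against a discriminator. The nearest-neighbor lookup against a catalog, and the names `make_generator`, `cosine`, and `recommend`, are assumptions introduced for illustration only.

```python
import math
import random


def make_generator(feat_dim, noise_dim, seed=0):
    """Build a toy stand-in for a trained transformer network G(query, z).

    Assumption: one linear layer followed by tanh, with random weights.
    These weights are NOT learned; a real model would be trained
    adversarially so that its outputs resemble real target features.
    """
    rng = random.Random(seed)
    in_dim = feat_dim + noise_dim
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.gauss(0.0, scale) for _ in range(in_dim)]
               for _ in range(feat_dim)]

    def generate(query_feat, noise):
        # Condition on the query by concatenating it with the noise vector.
        x = list(query_feat) + list(noise)
        return [math.tanh(sum(w * v for w, v in zip(row, x)))
                for row in weights]

    return generate


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(generate, query_feat, catalog, noise_dim, num_samples=5, seed=1):
    """Sample several generated target features for one query item.

    Each generated vector is mapped to the index of its most similar
    catalog item. This retrieval step is an assumption of the sketch.
    """
    rng = random.Random(seed)
    picks = []
    for _ in range(num_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
        generated = generate(query_feat, z)
        best = max(range(len(catalog)),
                   key=lambda i: cosine(generated, catalog[i]))
        picks.append(best)
    return picks


if __name__ == "__main__":
    feat_dim, noise_dim = 8, 4
    data_rng = random.Random(42)
    # Hypothetical catalog of target-category (e.g., tops) feature vectors.
    catalog = [[data_rng.gauss(0.0, 1.0) for _ in range(feat_dim)]
               for _ in range(20)]
    # Hypothetical query feature (e.g., extracted from a bottom item).
    query = [data_rng.gauss(0.0, 1.0) for _ in range(feat_dim)]
    gen = make_generator(feat_dim, noise_dim)
    print(recommend(gen, query, catalog, noise_dim))
```

Because the noise vector changes on every draw, repeated sampling for the same query can land on different catalog items, which is the mechanism behind producing multiple candidates for one query. With untrained weights the picks are meaningless; the sketch only illustrates the data flow.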


Recommender systems; Complementary recommendation; Generative adversarial network; Unsupervised learning; Adversarial learning



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Cong Phuoc Huynh (1)
  • Arridhana Ciptadi (1)
  • Ambrish Tyagi (1)
  • Amit Agrawal (1)
  1. Amazon Lab126, Sunnyvale, USA
