Street2Fashion2Shop: Enabling Visual Search in Fashion e-Commerce Using Studio Images

  • Julia Lasserre
  • Christian Bracher
  • Roland Vollgraf
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11351)

Abstract

Visual search, in particular the street-to-shop task of matching fashion items displayed in everyday images with similar articles, is a challenging and commercially important problem in computer vision. Building on our successful Studio2Shop model [20], we report results on Street2Fashion2Shop, a pipeline architecture that stacks Street2Fashion, a segmentation model responsible for eliminating the background in a street image, with Fashion2Shop, an improved model that matches the remaining foreground image with “title images”, i.e. front views of fashion articles on a white background. Both segmentation and product matching rely on deep convolutional neural networks. The pipeline allows us to circumvent the lack of quality annotated wild data by leveraging specific data sets at each step. We show that the use of fashion-specific training data leads to superior performance of the segmentation model. Studio2Shop built its performance on FashionDNA, an in-house product representation trained on the rich, professionally curated Zalando catalogue. Our study presents a substantially improved version of FashionDNA that boosts the accuracy of the matching model. Results on external datasets confirm the viability of our approach.
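
The two-stage idea described in the abstract (first remove the background, then match the remaining foreground against title images) can be illustrated with a toy sketch. This is not the authors' implementation: the paper uses deep convolutional networks for both stages, whereas here `segment` is a hypothetical brightness threshold, `embed` is a hypothetical flattening of pixels into a vector, and the cosine-similarity ranking is an assumption made only for illustration. All function names are invented for this sketch.

```python
import math

def segment(image, threshold=200):
    """Toy stand-in for the segmentation stage (assumption: not the paper's CNN).
    Marks pixels darker than `threshold` as foreground (1), the rest as background (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in image]

def remove_background(image, mask):
    """Keep foreground pixels and paint the background white (255),
    mimicking an article shown on a white background."""
    return [[px if m else 255 for px, m in zip(row, mrow)]
            for row, mrow in zip(image, mask)]

def embed(image):
    """Toy stand-in for a learned feature extractor: flatten the pixels into a vector."""
    return [float(px) for row in image for px in row]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def street_to_shop(street_image, catalogue, top_k=2):
    """Stack the two stages: segment the street image, then rank the
    catalogue's title images by similarity to the remaining foreground."""
    foreground = remove_background(street_image, segment(street_image))
    query = embed(foreground)
    scored = [(cosine(query, embed(title)), name)
              for name, title in catalogue.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]

if __name__ == "__main__":
    # 2x2 grayscale toy images; values are illustrative only.
    street = [[10, 230], [20, 240]]
    catalogue = {
        "dark_left": [[10, 255], [20, 255]],
        "dark_right": [[255, 10], [255, 20]],
    }
    print(street_to_shop(street, catalogue, top_k=1))
```

The point of the sketch is the composition: the matching stage only ever sees background-free images, so it can be trained and evaluated on studio-like data independently of the segmentation stage.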

Keywords

Visual search · Computer vision · Deep learning · Product matching · Fashion

References

  1. Cardoso, A., Daolio, F., Vargas, S.: Product characterisation towards personalisation: learning attributes from unstructured data to recommend fashion products. CoRR abs/1803.07679 (2018)
  2. Bossard, L., Dantone, M., Leistner, C., Wengert, C., Quack, T., Van Gool, L.: Apparel classification with style. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds.) ACCV 2012, Part IV. LNCS, vol. 7727, pp. 321–335. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-37447-0_25
  3. Bracher, C., Heinz, S., Vollgraf, R.: Fashion DNA: merging content and sales data for recommendation and article mapping. CoRR abs/1609.02489 (2016)
  4. Chen, H., Gallagher, A., Girod, B.: Describing clothing by semantic attributes. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part III. LNCS, vol. 7574, pp. 609–623. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33712-3_44
  5. Chen, Q., Huang, J., Feris, R., Brown, L.M., Dong, J., Yan, S.: Deep domain adaptation for describing people based on fine-grained clothing attributes. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
  6. Di, W., Wah, C., Bhardwaj, A., Piramuthu, R., Sundaresan, N.: Style finder: fine-grained clothing style detection and retrieval. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops (2013)
  7. Dong, J., Chen, Q., Xia, W., Huang, Z., Yan, S.: A deformable mixture parsing model with parselets. In: ICCV, pp. 3408–3415 (2013)
  8. Fu, J., Wang, J., Li, Z., Xu, M., Lu, H.: Efficient clothing retrieval with semantic-preserving visual phrases. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds.) ACCV 2012, Part II. LNCS, vol. 7725, pp. 420–431. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-37444-9_33
  9. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the International Conference on Computer Vision (ICCV) (2017)
  10. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. CoRR abs/1512.03385 (2015)
  11. Heinz, S., Bracher, C., Vollgraf, R.: An LSTM-based dynamic customer model for fashion recommendation. In: Proceedings of the 1st Workshop on Temporal Reasoning in Recommender Systems (RecSys 2017), pp. 45–49 (2017)
  12. Huang, J., Feris, R.S., Chen, Q., Yan, S.: Cross-domain image retrieval with a dual attribute-aware ranking network. In: IEEE International Conference on Computer Vision, ICCV 2015, Santiago, Chile, 7–13 December 2015, pp. 1062–1070 (2015)
  13. Jagadeesh, V., Piramuthu, R., Bhardwaj, A., Di, W., Sundaresan, N.: Large scale visual recommendations from street fashion images. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2014, pp. 1925–1934 (2014)
  14. Jetchev, N., Bergmann, U.: The conditional analogy GAN: swapping fashion articles on people images. In: The IEEE International Conference on Computer Vision (ICCV) Workshops, October 2017
  15. Ji, X., Wang, W., Zhang, M., Yang, Y.: Cross-domain image retrieval with attention modeling. In: Proceedings of the 2017 ACM on Multimedia Conference, MM 2017, pp. 1654–1662 (2017)
  16. Jing, Y., et al.: Visual search at Pinterest. In: KDD, pp. 1889–1898 (2015)
  17. Kalantidis, Y., Kennedy, L., Li, L.J.: Getting the look: clothing recognition and segmentation for automatic product suggestions in everyday photos. In: Proceedings of the 3rd ACM Conference on International Conference on Multimedia Retrieval, ICMR 2013, pp. 105–112 (2013)
  18. Kiapour, M.H., Yamaguchi, K., Berg, A.C., Berg, T.L.: Hipster wars: discovering elements of fashion styles. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part I. LNCS, vol. 8689, pp. 472–488. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_31
  19. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems 25, pp. 1097–1105 (2012)
  20. Lasserre, J., Rasch, K., Vollgraf, R.: Studio2Shop: from studio photo shoots to fashion articles. In: International Conference on Pattern Recognition, Applications and Methods (ICPRAM) (2018)
  21. Liang, X., et al.: Deep human parsing with active template regression. IEEE Trans. Pattern Anal. Mach. Intell. 37, 2402–2414 (2015)
  22. Liang, X., et al.: Human parsing with contextualized convolutional neural network. In: ICCV, pp. 1386–1394 (2015)
  23. Liu, S., et al.: Hi, magic closet, tell me what to wear! In: Proceedings of the 20th ACM International Conference on Multimedia, MM 2012, pp. 619–628 (2012)
  24. Liu, S., Song, Z., Liu, G., Xu, C., Lu, H., Yan, S.: Street-to-shop: cross-scenario clothing retrieval via parts alignment and auxiliary set. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3330–3337 (2012)
  25. Liu, Z., Luo, P., Qiu, S., Wang, X., Tang, X.: DeepFashion: powering robust clothes recognition and retrieval with rich annotations. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
  26. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
  27. Shankar, D., Narumanchi, S., Ananya, H.A., Kompalli, P., Chaudhury, K.: Deep learning based large scale visual recommendation and search for e-commerce. CoRR abs/1703.02344 (2017)
  28. Simo-Serra, E., Ishikawa, H.: Fashion style in 128 floats: joint ranking and classification using weak data for feature extraction. In: Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
  29. The Guardian (Zoe Wood): Asos app allows shoppers to snap up fashion (2017). https://www.theguardian.com/business/2017/jul/15/asos-app-allows-shoppers-to-snap-up-fashion
  30. Vittayakorn, S., Yamaguchi, K., Berg, A.C., Berg, T.L.: Runway to realway: visual analysis of fashion. In: IEEE Winter Conference on Applications of Computer Vision, pp. 951–958 (2015)
  31. Wang, N., Haizhou, A.: Who blocks who: simultaneous clothing segmentation for grouping images. In: Proceedings of the International Conference on Computer Vision, ICCV 2011 (2011)
  32. Wang, X., Sun, Z., Zhang, W., Zhou, Y., Jiang, Y.G.: Matching user photos to online products with robust deep features. In: Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, ICMR 2016, pp. 7–14 (2016)
  33. Wang, X., Zhang, T.: Clothes search in consumer photos via color matching and attribute learning. In: Proceedings of the 19th ACM International Conference on Multimedia, MM 2011, pp. 1353–1356 (2011)
  34. Yamaguchi, K., Kiapour, M.H., Berg, T.L.: Paper doll parsing: retrieving similar styles to parse clothing items. In: IEEE International Conference on Computer Vision, pp. 3519–3526 (2013)
  35. Yamaguchi, K., Kiapour, M.H., Ortiz, L., Berg, T.: Parsing clothing in fashion photographs. In: Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2012, pp. 3570–3577 (2012)
  36. Yamaguchi, K., Okatani, T., Sudo, K., Murasaki, K., Taniguchi, Y.: Mix and match: joint model for clothing and attribute recognition. In: Proceedings of the British Machine Vision Conference (BMVC), pp. 51.1–51.12 (2015)
  37. Yang, F., et al.: Visual search at eBay. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2017, pp. 2101–2110 (2017)
  38. Yoo, D., Kim, N., Park, S., Paek, A.S., Kweon, I.S.: Pixel-level domain transfer. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 517–532. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8_31
  39. Zheng, S., et al.: Conditional random fields as recurrent neural networks. In: Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), ICCV 2015, pp. 1529–1537 (2015)
  40. Zhu, S., Fidler, S., Urtasun, R., Lin, D., Loy, C.C.: Be your own Prada: fashion synthesis with structural coherence. In: International Conference on Computer Vision (ICCV) (2017)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Julia Lasserre (1)
  • Christian Bracher (1)
  • Roland Vollgraf (1)
  1. Zalando Research, Berlin, Germany
