Fully Convolutional Network with Superpixel Parsing for Fashion Web Image Segmentation

  • Conference paper
MultiMedia Modeling (MMM 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10132)

Abstract

In this paper we introduce a new method for extracting deformable clothing items from still images by extending the output of a Fully Convolutional Neural Network (FCN) to infer context from local units (superpixels). To achieve this, we optimize, over the space of all possible pixel labellings, an energy function that combines the large-scale structure of the image with the local low-level visual descriptions of superpixels. To assess our method, we compare it to the unmodified FCN used as a baseline, as well as to the well-known Paper Doll and Co-parsing methods for fashion images.
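
The abstract does not give the exact form of the energy function, so the following is only a minimal illustrative sketch of the general idea, not the authors' formulation. It assumes a unary term obtained by averaging per-pixel FCN class probabilities inside each superpixel, a Potts-style pairwise term between adjacent superpixels weighted by the similarity of their low-level descriptors, and a simple iterated conditional modes (ICM) minimizer. The function names, the parameters `lam` and `sigma`, and the choice of ICM are all assumptions made for illustration; superpixel ids are assumed to be contiguous integers starting at 0.

```python
import numpy as np

def superpixel_unaries(probs, segments):
    """Unary cost per superpixel: -log of the mean FCN class probability.

    probs:    (H, W, C) per-pixel class probabilities (hypothetical FCN output).
    segments: (H, W) integer superpixel ids, assumed contiguous from 0.
    """
    n_sp = int(segments.max()) + 1
    n_cls = probs.shape[-1]
    sums = np.zeros((n_sp, n_cls))
    np.add.at(sums, segments.ravel(), probs.reshape(-1, n_cls))
    counts = np.bincount(segments.ravel(), minlength=n_sp)[:, None]
    return -np.log(sums / counts + 1e-9)

def adjacent_pairs(segments):
    """Pairs (i, j) with i < j of superpixels sharing a horizontal or vertical border."""
    pairs = set()
    shifted = ((segments[:, :-1], segments[:, 1:]),
               (segments[:-1, :], segments[1:, :]))
    for a, b in shifted:
        border = a != b
        for i, j in zip(a[border], b[border]):
            pairs.add((int(min(i, j)), int(max(i, j))))
    return pairs

def icm_labels(unary, pairs, features, lam=1.0, sigma=1.0, n_iter=10):
    """Approximately minimize unary + weighted Potts energy with ICM.

    features: (n_sp, D) low-level descriptor per superpixel (e.g. mean colour);
              similar neighbours are penalized more for taking different labels.
    """
    n_sp, n_cls = unary.shape
    neighbours = [[] for _ in range(n_sp)]
    for i, j in pairs:
        dist2 = np.sum((features[i] - features[j]) ** 2)
        w = lam * np.exp(-dist2 / (2.0 * sigma ** 2))
        neighbours[i].append((j, w))
        neighbours[j].append((i, w))

    labels = unary.argmin(axis=1)          # start from the unary-only labelling
    classes = np.arange(n_cls)
    for _ in range(n_iter):
        changed = False
        for i in range(n_sp):
            cost = unary[i].copy()
            for j, w in neighbours[i]:
                cost += w * (classes != labels[j])   # Potts penalty
            best = int(cost.argmin())
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels
```

In this sketch a pixel labelling is recovered with `labels[segments]`. ICM only finds a local minimum; it is used here because it is short and easy to follow, not because the paper is known to use it.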

Author information

Corresponding author

Correspondence to Lixuan Yang.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Yang, L., Rodriguez, H., Crucianu, M., Ferecatu, M. (2017). Fully Convolutional Network with Superpixel Parsing for Fashion Web Image Segmentation. In: Amsaleg, L., Guðmundsson, G., Gurrin, C., Jónsson, B., Satoh, S. (eds) MultiMedia Modeling. MMM 2017. Lecture Notes in Computer Science, vol 10132. Springer, Cham. https://doi.org/10.1007/978-3-319-51811-4_12

  • DOI: https://doi.org/10.1007/978-3-319-51811-4_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-51810-7

  • Online ISBN: 978-3-319-51811-4

  • eBook Packages: Computer Science (R0)
