Fully Convolutional Network with Superpixel Parsing for Fashion Web Image Segmentation

Yang, Lixuan; Rodriguez, Helena; Crucianu, Michel; Ferecatu, Marin

doi:10.1007/978-3-319-51811-4_12

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10132)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

In this paper we introduce a new method for extracting deformable clothing items from still images by extending the output of a Fully Convolutional Neural Network (FCN) to infer context from local units (superpixels). To achieve this, we optimize, over the space of all possible pixel labellings, an energy function that combines the large-scale structure of the image with the local low-level visual descriptions of superpixels. To assess our method, we compare it to the unmodified FCN used as a baseline, as well as to the well-known Paper Doll and Co-parsing methods for fashion images.
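The abstract does not give the form of the energy function, so the following is only a minimal sketch of the general idea, under stated assumptions: a unary term taken from FCN class probabilities averaged over each superpixel, plus a Potts-style pairwise term weighted by the similarity of superpixel descriptors. The function names, the Gaussian edge weight, the parameters `lam` and `sigma`, and the use of iterated conditional modes (ICM) as the optimizer are all illustrative choices, not the authors' formulation.

```python
import math

def superpixel_unaries(pixel_probs, sp_of_pixel, n_superpixels, eps=1e-9):
    """Average per-pixel class probabilities inside each superpixel
    and return negative log-probabilities as unary costs."""
    n_labels = len(pixel_probs[0])
    sums = [[0.0] * n_labels for _ in range(n_superpixels)]
    counts = [0] * n_superpixels
    for probs, sp in zip(pixel_probs, sp_of_pixel):
        counts[sp] += 1
        for k, p in enumerate(probs):
            sums[sp][k] += p
    return [[-math.log(s / max(c, 1) + eps) for s in row]
            for row, c in zip(sums, counts)]

def edge_weight(f_a, f_b, sigma=1.0):
    """Assumed similarity between two superpixel descriptors (Gaussian kernel)."""
    d2 = sum((a - b) ** 2 for a, b in zip(f_a, f_b))
    return math.exp(-d2 / sigma)

def energy(labels, unaries, edges, weights, lam):
    """Unary costs plus a weighted Potts penalty on adjacent superpixels
    that carry different labels."""
    e = sum(unaries[i][y] for i, y in enumerate(labels))
    e += lam * sum(w for (i, j), w in zip(edges, weights)
                   if labels[i] != labels[j])
    return e

def argmin_labels(unaries):
    """Label each superpixel independently by its cheapest unary cost."""
    return [min(range(len(u)), key=u.__getitem__) for u in unaries]

def icm(unaries, edges, weights, lam, max_sweeps=10):
    """Greedy coordinate descent (ICM): relabel one superpixel at a time,
    accepting a change only if it strictly lowers the local cost."""
    labels = argmin_labels(unaries)
    n_labels = len(unaries[0])
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(labels)):
            def local_cost(k):
                cost = unaries[i][k]
                for (a, b), w in zip(edges, weights):
                    if a == i and labels[b] != k:
                        cost += lam * w
                    elif b == i and labels[a] != k:
                        cost += lam * w
                return cost
            best = min(range(n_labels), key=local_cost)
            if local_cost(best) < local_cost(labels[i]):
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels

# Toy data (hypothetical): 6 pixels, 3 superpixels, labels 0=background, 1=clothing.
pixel_probs = [[0.1, 0.9], [0.2, 0.8], [0.55, 0.45],
               [0.6, 0.4], [0.9, 0.1], [0.8, 0.2]]
sp_of_pixel = [0, 0, 1, 1, 2, 2]
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # assumed low-level descriptors
edges = [(0, 1), (1, 2)]                          # adjacent superpixel pairs
lam = 1.0

unaries = superpixel_unaries(pixel_probs, sp_of_pixel, 3)
weights = [edge_weight(features[i], features[j]) for i, j in edges]
initial_labels = argmin_labels(unaries)
final_labels = icm(unaries, edges, weights, lam)
print(initial_labels, final_labels)
```

In this toy setup the middle superpixel has an ambiguous FCN score but a descriptor close to its clothing neighbour, so the pairwise term pulls it towards the clothing label. ICM only finds a local minimum; it is used here because it is short to write, not because the paper relies on it.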

Author information

Authors and Affiliations

Conservatoire National des Arts et Métiers, 292 Rue Saint-Martin, 75003, Paris, France
Lixuan Yang, Michel Crucianu & Marin Ferecatu
Shopedia SAS, 16 Rue des Blancs Manteaux, 75004, Paris, France
Lixuan Yang & Helena Rodriguez

Corresponding author

Correspondence to Lixuan Yang.

Editor information

Editors and Affiliations

CNRS–IRISA, Rennes, France
Laurent Amsaleg
Reykjavík University, Reykjavik, Iceland
Gylfi Þór Guðmundsson
Dublin City University, Dublin, Ireland
Cathal Gurrin
Reykjavík University, Reykjavik, Iceland
Björn Þór Jónsson
National Institute of Informatics, Tokyo, Japan
Shin’ichi Satoh

About this paper

Cite this paper

Yang, L., Rodriguez, H., Crucianu, M., Ferecatu, M. (2017). Fully Convolutional Network with Superpixel Parsing for Fashion Web Image Segmentation. In: Amsaleg, L., Guðmundsson, G., Gurrin, C., Jónsson, B., Satoh, S. (eds) MultiMedia Modeling. MMM 2017. Lecture Notes in Computer Science, vol 10132. Springer, Cham. https://doi.org/10.1007/978-3-319-51811-4_12

DOI: https://doi.org/10.1007/978-3-319-51811-4_12
Published: 31 December 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51810-7
Online ISBN: 978-3-319-51811-4
eBook Packages: Computer Science, Computer Science (R0)
