Unsupervised Deep Clustering for Fashion Images

Yan, Cairong; Malhi, Umar Subhan; Huang, Yongfeng; Tao, Ran

doi:10.1007/978-3-030-21451-7_8

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1027)

Included in the following conference series:

International Conference on Knowledge Management in Organizations

Abstract

In visual domains such as fashion, an effective unsupervised clustering model depends on visual feature representations rather than on structured or semi-structured data. In this paper, we propose a fashion image deep clustering (FiDC) model with two parts: feature representation and clustering. Fashion images are fed to a deep stacked autoencoder that produces a latent feature representation, and the output of the autoencoder is used as the input to the clustering task. Because the output of the former strongly influences the latter, the model integrates the learning processes of the autoencoder and the clustering. The autoencoder is trained with the optimal number of neurons per hidden layer to avoid overfitting, and we optimize the cluster centroids using stochastic gradient descent and the backpropagation algorithm. We evaluate the FiDC model on a real-world fashion dataset downloaded from Amazon, in which the images have been converted into 4096-dimensional visual feature vectors by convolutional neural networks. The experimental results show that our model achieves state-of-the-art performance.
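
The abstract does not state the clustering objective, so the following is only a minimal sketch of how cluster centroids might be refined by gradient descent on latent features. It assumes a DEC-style objective (Student's t soft assignments, a sharpened target distribution, and a KL-divergence loss), which is borrowed for illustration and is not confirmed as the FiDC loss. The autoencoder is omitted: the toy 2-D `latent` points stand in for its output, and all function names, the learning rate, and the data are hypothetical.

```python
import math

def soft_assign(z, mu):
    """Soft assignment q[i][j] of latent point i to centroid j (Student's t kernel, 1 d.o.f.)."""
    q = []
    for zi in z:
        k = [1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(zi, mj))) for mj in mu]
        s = sum(k)
        q.append([kj / s for kj in k])
    return q

def target_distribution(q):
    """Sharpened targets p[i][j], proportional to q[i][j]^2 divided by the soft cluster size."""
    freq = [sum(row[j] for row in q) for j in range(len(q[0]))]
    p = []
    for row in q:
        w = [row[j] ** 2 / freq[j] for j in range(len(row))]
        s = sum(w)
        p.append([wj / s for wj in w])
    return p

def kl_loss(p, q):
    """KL(P || Q) summed over all points and clusters."""
    return sum(pij * math.log(pij / qij)
               for pr, qr in zip(p, q)
               for pij, qij in zip(pr, qr) if pij > 0)

def centroid_grad(z, mu, p, q):
    """Gradient of the KL loss with respect to each centroid, with p held fixed."""
    grad = [[0.0] * len(mu[0]) for _ in mu]
    for zi, pr, qr in zip(z, p, q):
        for j, mj in enumerate(mu):
            d = sum((a - b) ** 2 for a, b in zip(zi, mj))
            c = -2.0 * (pr[j] - qr[j]) / (1.0 + d)
            for t in range(len(mj)):
                grad[j][t] += c * (zi[t] - mj[t])
    return grad

def sgd_step(z, mu, lr=0.1):
    """One gradient step on the centroids; returns (new centroids, loss before the step)."""
    q = soft_assign(z, mu)
    p = target_distribution(q)  # recomputed every step here for simplicity
    g = centroid_grad(z, mu, p, q)
    new_mu = [[m - lr * gt for m, gt in zip(mj, gj)] for mj, gj in zip(mu, g)]
    return new_mu, kl_loss(p, q)

# Toy stand-in for autoencoder output: two well-separated groups of 2-D latent points.
latent = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0],
          [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centroids = [[1.0, 1.0], [4.0, 4.0]]  # hypothetical initial centroids

for _ in range(20):
    centroids, loss = sgd_step(latent, centroids)

# Hard labels: index of the centroid with the largest soft assignment.
labels = [max(range(len(row)), key=row.__getitem__)
          for row in soft_assign(latent, centroids)]
print(labels, centroids)
```

In a joint model of the kind the abstract describes, the same loss gradient would also be backpropagated into the encoder weights so that the latent space and the centroids are learned together; this sketch updates only the centroids.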

Acknowledgments

This research was partly funded by the National Natural Science Foundation of China (No. 61402100).

Author information

Authors and Affiliations

School of Computer Science and Technology, Donghua University, Shanghai, China
Cairong Yan, Umar Subhan Malhi, Yongfeng Huang & Ran Tao

Corresponding author

Correspondence to Cairong Yan.

Editor information

Editors and Affiliations

University of Staffordshire, Stoke-on-Trent, UK
Lorna Uden
National University of Kaohsiung, Kaohsiung, Taiwan
I-Hsien Ting
University of Salamanca, Salamanca, Spain
Juan Manuel Corchado

About this paper

Cite this paper

Yan, C., Malhi, U.S., Huang, Y., Tao, R. (2019). Unsupervised Deep Clustering for Fashion Images. In: Uden, L., Ting, IH., Corchado, J. (eds) Knowledge Management in Organizations. KMO 2019. Communications in Computer and Information Science, vol 1027. Springer, Cham. https://doi.org/10.1007/978-3-030-21451-7_8

DOI: https://doi.org/10.1007/978-3-030-21451-7_8
Published: 12 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21450-0
Online ISBN: 978-3-030-21451-7
eBook Packages: Computer Science; Computer Science (R0)
