Abstract
In many visual domains such as fashion, building an effective unsupervised clustering model depends on visual feature representations rather than on structured or semi-structured data. In this paper, we propose a fashion image deep clustering (FiDC) model that consists of two parts: feature representation and clustering. Fashion images are used as the input and are processed by a deep stacked autoencoder to produce a latent feature representation, and the output of this autoencoder is used as the input of the clustering task. Because the output of the former strongly influences the latter, the model integrates the learning processes of the autoencoder and the clustering. The autoencoder is trained with the optimal number of neurons per hidden layer to avoid overfitting, and we optimize the cluster centroids using stochastic gradient descent and the backpropagation algorithm. We evaluate the FiDC model on a real-world fashion dataset downloaded from Amazon, in which images have been converted into 4096-dimensional visual feature vectors by convolutional neural networks. The experimental results show that our model achieves state-of-the-art performance.
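The abstract does not state the clustering objective, so the following is only a minimal sketch of how cluster centroids can be refined by gradient descent on latent vectors. It assumes the Student's t soft assignment and KL-divergence loss used in deep embedded clustering, which may differ from the loss FiDC actually uses. The random array `z` stands in for the autoencoder's latent output, and all function names, the latent size of 10, the 4 clusters, and the learning rate are hypothetical choices made for illustration.

```python
import numpy as np

def soft_assign(z, centroids):
    """Student's t soft assignment q[i, j] of latent point i to centroid j."""
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    kernel = 1.0 / (1.0 + d2)
    return kernel / kernel.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened targets p that put more weight on confident assignments."""
    weight = q ** 2 / q.sum(axis=0)
    return weight / weight.sum(axis=1, keepdims=True)

def kl_loss(p, q):
    """KL(P || Q) summed over all points and clusters."""
    return float(np.sum(p * np.log(p / q)))

def centroid_grad(z, centroids, p, q):
    """Analytic gradient of kl_loss with respect to the centroids (p held fixed)."""
    diff = z[:, None, :] - centroids[None, :, :]       # shape (n, k, d)
    kernel = 1.0 / (1.0 + (diff ** 2).sum(axis=2))     # shape (n, k)
    coeff = kernel * (p - q)                           # shape (n, k)
    return -2.0 * (coeff[:, :, None] * diff).sum(axis=0)

def sgd_step(z, centroids, lr=0.01):
    """One gradient-descent update of the centroids."""
    q = soft_assign(z, centroids)
    p = target_distribution(q)
    return centroids - lr * centroid_grad(z, centroids, p, q)

# Assumption: random vectors stand in for the autoencoder's latent features.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 10))
centroids = z[rng.choice(200, size=4, replace=False)].copy()

for _ in range(50):
    centroids = sgd_step(z, centroids)

# Hard cluster labels are read off as the most probable centroid per point.
labels = soft_assign(z, centroids).argmax(axis=1)
```

In a joint model of the kind the abstract describes, the same loss gradient would also be propagated back into the encoder weights; this sketch updates only the centroids to keep the example short.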
Acknowledgments
This research was partly funded by the National Natural Science Foundation of China (No. 61402100).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Yan, C., Malhi, U.S., Huang, Y., Tao, R. (2019). Unsupervised Deep Clustering for Fashion Images. In: Uden, L., Ting, IH., Corchado, J. (eds) Knowledge Management in Organizations. KMO 2019. Communications in Computer and Information Science, vol 1027. Springer, Cham. https://doi.org/10.1007/978-3-030-21451-7_8
DOI: https://doi.org/10.1007/978-3-030-21451-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21450-0
Online ISBN: 978-3-030-21451-7
eBook Packages: Computer Science (R0)