Visual Product Recommendation Using Neural Aggregation Network and Context Gating
In this paper, we focus on the problem of classifying user interests in visual product recommender systems. We propose a two-stage procedure. In the first stage, image features are learned by fine-tuning a convolutional neural network, e.g., MobileNet. In the second stage, we use learnable pooling techniques, namely a neural aggregation network and context gating, to compute a weighted average of the image features. As a result, we can capture the relationships between the images of products purchased by the same user. We provide an experimental study with the Amazon product dataset. Our approach achieves an F1-measure of 0.90 for 15 recommendations, which is much higher than the F1-measure of 0.66 obtained by traditional averaging of the feature vectors.
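The second stage described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the attention form commonly used in neural aggregation networks (softmax over dot products with a kernel vector) and the usual context gating form (a sigmoid gate multiplied element-wise with the input). The names `attention_pool`, `context_gating`, `q`, `W`, and `b` are hypothetical, and the parameters are random here, whereas in practice they would be learned.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(features, q):
    """Weighted average of per-image features (assumed attention form).

    features: (n_images, d) array, one row per purchased product image.
    q: (d,) kernel vector (random here, learned in practice).
    Returns the pooled (d,) descriptor and the (n_images,) weights.
    """
    weights = softmax(features @ q)
    return weights @ features, weights

def context_gating(x, W, b):
    """Element-wise gating: sigmoid(W x + b) * x (assumed form)."""
    gate = 1.0 / (1.0 + np.exp(-(W @ x + b)))
    return gate * x

# Toy data: 5 product images with 8-dimensional features (illustrative only).
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 8))
q = rng.normal(size=8)
W = 0.1 * rng.normal(size=(8, 8))
b = np.zeros(8)

pooled, weights = attention_pool(features, q)
gated = context_gating(pooled, W, b)

# With a zero kernel all weights are equal, so the pooling reduces to
# the traditional average of the feature vectors used as the baseline.
uniform, _ = attention_pool(features, np.zeros(8))
```

With a zero kernel the weights are uniform and the result equals the plain mean of the feature vectors, which is the baseline the abstract compares against. A learned kernel lets some images contribute more to the user descriptor than others.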
Keywords: Visual recommender system; Deep convolutional neural networks; Learnable pooling; Neural aggregation network; Context gating
The article was prepared within the framework of the Academic Fund Program at the National Research University Higher School of Economics (HSE University) in 2019 (Grant No. 19-04-004) and within the framework of the Russian Academic Excellence Project “5-100”.