Abstract
The article addresses the problem of image classification on a relatively small dataset. Training a deep convolutional neural network from scratch requires a large amount of data, so in many cases the solution is to take a network pretrained on another large dataset (e.g. ImageNet) and fine-tune it on the available data. In this article, we apply this approach to classify images of advertising banners. Initially, we reset the weights of the last layer and resize it to match the number of classes in our dataset. Then we train the whole network, but with the learning rate for the last layer set several times higher than for the other layers. We use the Adam optimization algorithm with some modifications. First, applying decoupled weight decay instead of L2 regularization (for Adam the two are not equivalent) improves the result. Second, dividing the learning rate by the running maximum of the squared-gradient average, instead of the current average itself (the AMSGrad variant), makes the training process more stable. Experiments have shown that this approach is appropriate for classifying relatively small datasets. The metrics used and test-time augmentation are discussed. In particular, we find the confusion matrix very useful because it gives an understanding of how to modify the training set to increase model quality.
References
Karpathy, A.: Convolutional neural networks for visual recognition. https://cs231n.github.io/transfer-learning/. Accessed 1 Apr 2019
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition (2015). arXiv preprint, arXiv:1409.1556v6 [cs.CV]
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR, pp. 770–778. IEEE, New Jersey (2016)
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR, pp. 2818–2826. IEEE, New Jersey (2016)
Smith, L.: Cyclical learning rates for training neural networks. In: IEEE Winter Conference on Applications of Computer Vision, WACV, pp. 464–472. IEEE, New Jersey (2017)
Gupta, A.: Super-convergence: very fast training of neural networks using large learning rates. https://towardsdatascience.com/https-medium-com-super-convergence-very-fast-training-of-neural-networks-using-large-learning-rates-decb689b9eb0. Accessed 10 Apr 2019
Loshchilov, I., Hutter, F.: Decoupled weight decay regularization (2019). arXiv preprint, arXiv:1711.05101v3 [cs.LG]
Reddi, S., Kale, S., Kumar, S.: On the convergence of Adam and beyond. In: International Conference on Learning Representations, ICLR, Vancouver, BC, Canada, pp. 186–208 (2018)
Ayhan, M., Berens, P.: Test-time data augmentation for estimation of heteroscedastic aleatoric uncertainty in deep neural networks. In: Medical Imaging with Deep Learning Conference, MIDL, Amsterdam, Netherlands, pp. 278–286 (2018)
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Fedorenko, Y.S. (2020). The Simple Approach to Multi-label Image Classification Using Transfer Learning. In: Kryzhanovsky, B., Dunin-Barkowski, W., Redko, V., Tiumentsev, Y. (eds) Advances in Neural Computation, Machine Learning, and Cognitive Research III. NEUROINFORMATICS 2019. Studies in Computational Intelligence, vol 856. Springer, Cham. https://doi.org/10.1007/978-3-030-30425-6_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30424-9
Online ISBN: 978-3-030-30425-6
eBook Packages: Intelligent Technologies and Robotics (R0)