Abstract
This study explored the viability of out-of-the-box, pre-trained ConvNet models as feature generators for large-scale classification tasks, and compared them with generative methods for vocabulary generation. The two approaches were chosen as ways to incorporate other datasets (transfer learning) and unlabelled data, respectively. They were also combined, to study whether a ConvNet model can estimate category labels for unlabelled images. All experiments were carried out on a two-class dataset, later expanded to a 5-category dataset. The pre-trained models were obtained from the Caffe Model Zoo.
The pre-trained model achieved the best results on the binary dataset, with an accuracy of 0.945. On the 5-class dataset, however, generative vocabularies outperformed the ConvNet (0.91 vs. 0.861). Furthermore, when labelled images were replaced with unlabelled ones during training, acceptable accuracy scores were still obtained (as high as 0.903). Linear kernels were also observed to perform particularly well with generative models. This is especially relevant in comparison with ConvNets, which require days of training even when multiple GPUs are used.
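The idea of replacing labelled training images with unlabelled ones whose labels are estimated by a model can be illustrated with a small sketch. This is not the authors' pipeline: the feature vectors below are synthetic stand-ins for pre-extracted image features, a ridge-regularised least-squares linear classifier stands in for whatever linear-kernel classifier the study used, and every function name, dimension, and sample count is a hypothetical choice made for illustration only.

```python
import numpy as np

def fit_linear(X, y, n_classes, ridge=1e-3):
    """Fit a ridge-regularised least-squares linear classifier on one-hot targets."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # append a bias column
    Y = np.eye(n_classes)[y]                         # one-hot encode the labels
    A = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)              # weights, shape (d + 1, n_classes)

def predict_linear(W, X):
    """Predict the class with the highest linear score."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.argmax(Xb @ W, axis=1)

def make_features(rng, n_per_class, centers, noise=0.5):
    """Synthetic stand-in for image features (assumption: one Gaussian blob per class)."""
    dim = centers.shape[1]
    X = np.vstack([c + noise * rng.standard_normal((n_per_class, dim)) for c in centers])
    y = np.repeat(np.arange(len(centers)), n_per_class)
    return X, y

rng = np.random.default_rng(0)
n_classes, dim = 5, 16                               # hypothetical sizes
centers = 3.0 * rng.standard_normal((n_classes, dim))

X_lab, y_lab = make_features(rng, 20, centers)       # small labelled set
X_unl, _ = make_features(rng, 200, centers)          # unlabelled set: labels discarded
X_test, y_test = make_features(rng, 100, centers)    # held-out evaluation set

# Step 1: train on the labelled features only.
W0 = fit_linear(X_lab, y_lab, n_classes)

# Step 2: estimate labels for the unlabelled features.
pseudo = predict_linear(W0, X_unl)

# Step 3: retrain on labelled plus pseudo-labelled features.
W1 = fit_linear(np.vstack([X_lab, X_unl]),
                np.concatenate([y_lab, pseudo]), n_classes)

acc_labelled_only = float(np.mean(predict_linear(W0, X_test) == y_test))
acc_with_pseudo = float(np.mean(predict_linear(W1, X_test) == y_test))
print(f"labelled only: {acc_labelled_only:.3f}, with pseudo-labels: {acc_with_pseudo:.3f}")
```

On real data the estimated labels are noisy, so the accuracy after retraining depends on how reliable the labelling model is; the synthetic classes here are deliberately well separated and say nothing about the figures reported above.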
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Michael, J., Teixeira, L.F. (2017). Pre-trained Convolutional Networks and Generative Statistical Models: A Comparative Study in Large Datasets. In: Alexandre, L., Salvador Sánchez, J., Rodrigues, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2017. Lecture Notes in Computer Science(), vol 10255. Springer, Cham. https://doi.org/10.1007/978-3-319-58838-4_8
DOI: https://doi.org/10.1007/978-3-319-58838-4_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58837-7
Online ISBN: 978-3-319-58838-4
eBook Packages: Computer Science, Computer Science (R0)