Abstract
The Bag-of-features model has recently achieved great success in image categorisation and has become the state of the art. Support vector machines (SVMs) have played an important role in this success. This chapter first introduces the fundamentals of the Bag-of-features model in image categorisation. It then focuses on how SVM classifiers are applied to this model. In particular, we present the novel kernels developed to compare images based on the variety of representations that arise from this model. We also discuss how these kernels are implicitly implemented or effectively approximated to work with linear SVMs. Throughout the chapter, we will see that the application of SVMs not only demonstrates their elegance and efficiency but also raises new research issues that stimulate the development of SVMs.
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Wang, L., Liu, L., Zhou, L., Chan, K.L. (2014). Application of SVMs to the Bag-of-Features Model: A Kernel Perspective. In: Ma, Y., Guo, G. (eds) Support Vector Machines Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-02300-7_5
DOI: https://doi.org/10.1007/978-3-319-02300-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02299-4
Online ISBN: 978-3-319-02300-7
eBook Packages: Engineering, Engineering (R0)