Bag of Features vs Vector of Locally Aggregated Descriptors

Younas, Farkhunda; Baber, Junaid; Mahmood, Tahir; Farooq, Javeria; Bakhtyar, Maheen

doi:10.1007/978-3-319-56991-8_10

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 16)

Included in the following conference series:

Proceedings of SAI Intelligent Systems Conference

Abstract

Representing an image by a set of local features is common and is also state of the art for many applications such as image retrieval and image classification. A single image contains on average 2.5k–3.0k features. Searching images based on local features is more discriminative than searching with global features, at the cost of heavy computational overhead. Bag-of-Features (BoF), also known as bag-of-visual-words, is used for feature quantization, which makes searching local features feasible in very large databases at the cost of distinctiveness. In most such applications, the vocabulary size is kept up to 1 million. In this study, we investigate the performance of the Vector of Locally Aggregated Descriptors (VLAD), recently proposed as an alternative to BoF, for different families of descriptors. VLAD achieves similar or sometimes better performance than BoF despite its limited vocabulary size. In the literature, VLAD is mostly compared with BoF on gradient-based descriptors. In our experiments, we use a gradient-based descriptor, an intensity-based descriptor, and a binary descriptor: Scale Invariant Feature Transform (SIFT), Local Intensity Order Pattern (LIOP), and BInarization of Gradient Orientation Histograms (BIGOH), respectively, are used to validate the performance of VLAD alongside BoF on a well-known benchmark dataset. VLAD outperforms BoF for the gradient-based and intensity-based families, but neither of the two is feasible for binary descriptors.
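
The difference between the two encodings compared in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`assign_to_words`, `bof_encode`, `vlad_encode`), the random stand-in descriptors, the random stand-in vocabulary, and the choice of k = 16 words are all assumptions made for illustration. In practice the vocabulary would typically be learned with k-means on training descriptors. The sketch also assumes real-valued descriptors and Euclidean nearest-word assignment.

```python
import numpy as np

def assign_to_words(descriptors, vocabulary):
    """Return the index of the nearest visual word (Euclidean) per descriptor."""
    # Squared distances between every descriptor and every word: shape (n, k).
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def bof_encode(descriptors, vocabulary):
    """BoF: L1-normalized histogram of visual-word counts (length k)."""
    words = assign_to_words(descriptors, vocabulary)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / max(hist.sum(), 1.0)

def vlad_encode(descriptors, vocabulary):
    """VLAD: per-word sum of residuals, flattened and L2-normalized (length k*d)."""
    words = assign_to_words(descriptors, vocabulary)
    k, d = vocabulary.shape
    v = np.zeros((k, d))
    for i in range(k):
        members = descriptors[words == i]
        if len(members):
            # Accumulate the differences between descriptors and their word.
            v[i] = (members - vocabulary[i]).sum(axis=0)
    v = v.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

# Synthetic data only: 2500 random 128-D vectors stand in for one image's
# local descriptors, and 16 random vectors stand in for a learned vocabulary.
rng = np.random.default_rng(0)
descriptors = rng.random((2500, 128))
vocabulary = rng.random((16, 128))

bof = bof_encode(descriptors, vocabulary)
vlad = vlad_encode(descriptors, vocabulary)
print(bof.shape, vlad.shape)
```

With k words and d-dimensional descriptors, the BoF vector has k entries while the VLAD vector has k·d entries, which is one way to see how VLAD can remain descriptive with a small vocabulary. Binary descriptors are usually compared with Hamming distance, so the Euclidean assignment and residual sums used here would not carry over to them unchanged.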

Acknowledgment

This research work is supported by the Higher Education Commission (HEC) of Pakistan, SBK Women’s University, and the University of Balochistan.

Author information

Authors and Affiliations

Department of Computer Science, Sardar Bahadur Khan Women’s University, Quetta, Pakistan
Farkhunda Younas
Department of Computer Science and Information Technology, University of Balochistan, Quetta, Pakistan
Junaid Baber & Maheen Bakhtyar
Department of Computer Science, COMSATS Institute of Information Technology, Islamabad, Pakistan
Tahir Mahmood
Department of Electronic Engineering, Balochistan University of Information Technology, Engineering and Management Sciences, Quetta, Pakistan
Javeria Farooq

Corresponding author

Correspondence to Junaid Baber.

Editor information

Editors and Affiliations

Faculty of Computing and Engineering, School of Computing and Mathematics, University of Ulster at Jordanstown, Newtownabbey, United Kingdom
Yaxin Bi
The Science and Information (SAI) Organization, Bradford, West Yorkshire, United Kingdom
Supriya Kapoor
The Science and Information (SAI) Organization, Bradford, West Yorkshire, United Kingdom
Rahul Bhatia

About this paper

Cite this paper

Younas, F., Baber, J., Mahmood, T., Farooq, J., Bakhtyar, M. (2018). Bag of Features vs Vector of Locally Aggregated Descriptors. In: Bi, Y., Kapoor, S., Bhatia, R. (eds) Proceedings of SAI Intelligent Systems Conference (IntelliSys) 2016. IntelliSys 2016. Lecture Notes in Networks and Systems, vol 16. Springer, Cham. https://doi.org/10.1007/978-3-319-56991-8_10

DOI: https://doi.org/10.1007/978-3-319-56991-8_10
Published: 23 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56990-1
Online ISBN: 978-3-319-56991-8
eBook Packages: Engineering, Engineering (R0)
