Composite Descriptors and Deep Features Based Visual Phrase for Image Retrieval

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11068))

Abstract

Local descriptors are very effective features in bag-of-visual-words (BoW) and vector of locally aggregated descriptors (VLAD) models for image retrieval. Different kinds of local descriptors represent different visual content. We recognize that spatial contextual information plays an important role in image matching, image retrieval and image recognition. Therefore, to explore efficient features, first, a new local composite descriptor is proposed, which combines the advantages of SURF and color name (CN) information. Second, the VLAD method is used to encode the proposed composite descriptors into a vector. Third, local deep features are extracted and fused with the encoded vector in the image block. Finally, to implement an efficient retrieval system, a novel image retrieval framework is organized based on the proposed feature fusion strategies. The proposed methods are verified on three benchmark datasets, i.e., Holidays, Oxford5k and Ukbench. Experimental results show that our methods achieve good performance. In particular, the mAP and N-S score reach 0.8281 and 3.5498 on the Holidays and Ukbench datasets, respectively.
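
The VLAD encoding step mentioned in the abstract can be illustrated with a minimal sketch of the standard formulation: each local descriptor is assigned to its nearest codebook centroid, the residuals are summed per centroid, and the concatenated result is L2-normalized. This is not the authors' implementation. The function name `vlad_encode`, the 2-D toy descriptors, and the two-word codebook are assumptions made for illustration only; the abstract does not state the composite descriptor's dimension, the codebook size, or the normalization used in the paper.

```python
import math


def vlad_encode(descriptors, centroids):
    """Encode local descriptors as one L2-normalized VLAD vector.

    descriptors: list of D-dimensional local descriptors (lists of floats)
    centroids:   list of K D-dimensional codebook centers (e.g. from k-means)
    Returns a flat list of length K * D.
    """
    k, d = len(centroids), len(centroids[0])
    residuals = [[0.0] * d for _ in range(k)]

    for x in descriptors:
        # Hard-assign the descriptor to its nearest centroid (squared Euclidean).
        nearest = min(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centroids[i])),
        )
        # Accumulate the residual x - c for that centroid.
        for j in range(d):
            residuals[nearest][j] += x[j] - centroids[nearest][j]

    # Concatenate the K residual sums into one vector.
    v = [value for row in residuals for value in row]
    # L2-normalize so images with different descriptor counts are comparable.
    norm = math.sqrt(sum(value * value for value in v))
    return [value / norm for value in v] if norm > 0 else v


# Toy data (hypothetical): 2-D stand-ins for composite descriptors
# and a codebook with two visual words.
centroids = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]
vlad = vlad_encode(descriptors, centroids)
```

With real data, the descriptors would be the SURF and CN composite descriptors of one image (or image block) and the centroids would come from a codebook trained offline; two images can then be compared by the dot product of their VLAD vectors.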

Supported by organization x.

Author information

Corresponding author

Correspondence to Yigang Cen.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, Y., Zhang, L., Cen, Y., Zhao, R., Chai, T., Cen, Y. (2018). Composite Descriptors and Deep Features Based Visual Phrase for Image Retrieval. In: Sun, X., Pan, Z., Bertino, E. (eds) Cloud Computing and Security. ICCCS 2018. Lecture Notes in Computer Science, vol 11068. Springer, Cham. https://doi.org/10.1007/978-3-030-00021-9_43

  • DOI: https://doi.org/10.1007/978-3-030-00021-9_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00020-2

  • Online ISBN: 978-3-030-00021-9

  • eBook Packages: Computer Science, Computer Science (R0)
