Graph-based particular object discovery
Abstract
Severe background clutter is challenging in many computer vision tasks, including large-scale image retrieval. Global descriptors, which are popular due to their memory and search efficiency, are especially prone to corruption by such clutter. Eliminating the impact of clutter on the image descriptor increases the chance of retrieving relevant images and, when query expansion is used, prevents topic drift caused by retrieving the clutter itself. In this work, we propose a novel salient region detection method. It captures, in an unsupervised manner, patterns that are both discriminative and common in the dataset. Saliency is based on a centrality measure of a nearest neighbor graph constructed from regional CNN representations of dataset images. The proposed method exploits recent CNN architectures trained for object retrieval to construct the image representation from the salient regions. We improve particular object retrieval on challenging datasets containing small objects.
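To make the graph-centrality idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the regional CNN descriptors have already been extracted into a matrix with one row per region. The abstract does not say which centrality measure is used, so Katz centrality is an assumption made here for illustration. The function names `knn_graph` and `katz_centrality` and the parameters `k` and `beta` are hypothetical.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric k-nearest-neighbor affinity matrix from region descriptors.

    X is an (n, d) array with one descriptor per region (assumed given).
    Edge weights are cosine similarities, clipped at zero.
    """
    X = np.asarray(X, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = X @ X.T
    np.fill_diagonal(S, -np.inf)           # a region is not its own neighbor
    n = S.shape[0]
    idx = np.argsort(-S, axis=1)[:, :k]    # k most similar regions per row
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    A = np.zeros((n, n))
    A[rows, cols] = np.clip(S[rows, cols], 0.0, None)
    return np.maximum(A, A.T)              # keep an edge if either end chose it

def katz_centrality(A, beta=0.9):
    """Katz centrality c = (I - alpha * A)^{-1} 1 with alpha = beta / lambda_max.

    Choosing beta < 1 keeps alpha below 1 / lambda_max, so the underlying
    series sum_t (alpha * A)^t converges. This choice of measure is an
    assumption, not something the abstract specifies.
    """
    n = A.shape[0]
    lam_max = np.linalg.eigvalsh(A).max()  # A is symmetric
    if lam_max <= 0:                       # graph without edges
        return np.ones(n)
    alpha = beta / lam_max
    return np.linalg.solve(np.eye(n) - alpha * A, np.ones(n))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical data: 10 near-duplicate regions plus 20 unrelated ones.
    base = rng.normal(size=32)
    repeated = base + 0.05 * rng.normal(size=(10, 32))
    clutter = rng.normal(size=(20, 32))
    scores = katz_centrality(knn_graph(np.vstack([repeated, clutter]), k=5))
    print(scores.round(2))
```

In this sketch, regions that recur across the collection are strongly connected in the graph and receive high centrality, while isolated regions receive scores close to 1. Such scores could then serve as saliency weights when aggregating regional descriptors into an image representation.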
Keywords
Image retrieval, Unsupervised object discovery, Image saliency
Acknowledgements
This work was supported by the OP VVV funded project CZ.02.1.01/0.0/0.0/16_019/0000765 “Research Center for Informatics.” The Tesla K40 used for this research was donated by the NVIDIA Corporation.