Large scale image retrieval with DCNN and local geometrical constraint model

  • Huabing Zhou
  • Yiwei Tao
  • Jinshu Shi
  • Xiaolin Li
  • Deng Chen
  • Yanduo Zhang
  • Liang Xie


Image retrieval, which refers to browsing, searching and retrieving images of the same scene or object from a large database of digital images, has attracted increasing interest in recent years. This paper proposes a coarse-to-fine method for fast indexing with a Deep Convolutional Neural Network (DCNN) and a Local Geometrical Constraint Model. We first use vector-quantized DCNN feature descriptors and exploit enhanced locality-sensitive hashing (LSH) techniques for fast coarse-grained retrieval. Then, we focus on retaining high-precision matches for fine-grained retrieval. This is formulated as a maximum likelihood estimation of a Bayesian model with latent variables indicating whether matches in the putative set are inliers or outliers. We impose non-parametric global geometrical constraints on the correspondence using Tikhonov regularizers in a reproducing kernel Hilbert space. To ensure the well-posedness of the problem, we develop a local geometrical constraint that preserves local structures among neighboring feature points and is robust to a large number of outliers. The problem is solved using the Expectation Maximization algorithm. Extensive experiments on real near-duplicate images for both feature matching and image retrieval demonstrate that the proposed method outperforms current state-of-the-art methods.
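As a rough illustration of the coarse-grained stage described above, the following minimal sketch indexes descriptor vectors with the basic p-stable (Euclidean) LSH scheme and ranks the colliding candidates by distance. It is not the paper's "enhanced" LSH, whose details are not given here: the class name `PStableLSH`, its parameters (`n_tables`, `n_hashes`, `width`) and the toy three-dimensional vectors standing in for DCNN descriptors are all assumptions made for the example.

```python
import math
import random
from collections import defaultdict


class PStableLSH:
    """Toy Euclidean LSH index (hypothetical helper, not the paper's code).

    Each hash is h(v) = floor((a . v + b) / width) with a ~ N(0, 1)
    and b ~ U[0, width); a table key concatenates n_hashes such values.
    """

    def __init__(self, dim, n_tables=8, n_hashes=4, width=4.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        # One list of (a, b) projection pairs per hash table.
        self.projections = [
            [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
             for _ in range(n_hashes)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]
        self.vectors = {}

    def _key(self, table_id, v):
        # Bucket key of vector v in the given table.
        return tuple(
            math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / self.width)
            for a, b in self.projections[table_id]
        )

    def add(self, item_id, v):
        # Store the vector and register it in one bucket per table.
        self.vectors[item_id] = v
        for t, table in enumerate(self.tables):
            table[self._key(t, v)].append(item_id)

    def query(self, v, top_k=5):
        # Gather every item sharing a bucket with v in any table,
        # then rank those candidates by exact Euclidean distance.
        candidates = set()
        for t, table in enumerate(self.tables):
            candidates.update(table.get(self._key(t, v), []))
        ranked = sorted(candidates, key=lambda i: math.dist(v, self.vectors[i]))
        return ranked[:top_k]


index = PStableLSH(dim=3)
index.add("img_a", [1.0, 2.0, 3.0])
index.add("img_b", [50.0, -40.0, 10.0])
shortlist = index.query([1.0, 2.0, 3.0], top_k=1)
```

In a coarse-to-fine pipeline of the kind the abstract outlines, a shortlist like this would then be passed to the fine-grained matching stage; larger `width` or more tables trade speed for recall.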


Image retrieval, Coarse-to-fine, DCNN, Local geometrical constraint model




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Hubei Key Laboratory of Intelligent Robot, Wuhan Institute of Technology, Wuhan, China
  2. School of Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, China
  3. Department of Mathematics, Wuhan University of Technology, Wuhan, China
