Cross-Domain Image Matching with Deep Feature Maps


We investigate the problem of automatically determining what type of shoe left an impression found at a crime scene. This recognition problem is made difficult by the variability in types of crime scene evidence (ranging from traces of dust or oil on hard surfaces to impressions made in soil) and the lack of comprehensive databases of shoe outsole tread patterns. We find that mid-level features extracted by pre-trained convolutional neural nets are surprisingly effective descriptors for this specialized domain. However, the choice of similarity measure for matching exemplars to a query image is essential to good performance. For matching multi-channel deep features, we propose the use of multi-channel normalized cross-correlation and analyze its effectiveness. Our proposed metric significantly improves performance in matching crime scene shoeprints to laboratory test impressions. We also show its effectiveness in other cross-domain image retrieval problems: matching facade images to segmentation labels and aerial photos to map images. Finally, we introduce a discriminatively trained variant and fine-tune our system through our proposed metric, obtaining state-of-the-art performance.
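To make the idea of a multi-channel normalized cross-correlation concrete, the sketch below shows one plausible reading of it: each feature channel is standardized independently (mean subtracted, divided by its standard deviation), the per-channel correlations are computed, and the results are averaged. The function names `ncc` and `mcncc`, the averaging over channels, the `eps` stabilizer, and the toy feature maps are all assumptions made for illustration; the abstract does not specify these details, and this is not the authors' implementation.

```python
import math


def ncc(a, b, eps=1e-8):
    """Normalized cross-correlation of two equal-length flat sequences.

    Each sequence is centered on its own mean, and the covariance is
    divided by the product of the two standard deviations.
    `eps` (an assumed stabilizer) avoids division by zero for flat channels.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    var_a = sum(v * v for v in da) / n
    var_b = sum(v * v for v in db) / n
    cov = sum(x * y for x, y in zip(da, db)) / n
    return cov / (math.sqrt(var_a * var_b) + eps)


def mcncc(x, y, eps=1e-8):
    """Average of per-channel NCC scores (assumed aggregation).

    `x` and `y` are feature maps given as lists of channels, where each
    channel is a flat list of activations over the same spatial support.
    """
    if len(x) != len(y):
        raise ValueError("feature maps must have the same number of channels")
    scores = [ncc(cx, cy, eps) for cx, cy in zip(x, y)]
    return sum(scores) / len(scores)


# Toy two-channel "feature maps" (hypothetical values).
query = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
# Same pattern, but each channel has a different gain and offset.
rescaled = [[10.0 * v + 5.0 for v in query[0]], [0.5 * v - 2.0 for v in query[1]]]
# Same pattern with every channel inverted.
inverted = [[-v for v in channel] for channel in query]

score_rescaled = mcncc(query, rescaled)
score_inverted = mcncc(query, inverted)
```

Because normalization happens per channel, `score_rescaled` stays close to 1 even though the two channels were rescaled by different gains and offsets, while `score_inverted` is close to -1. This invariance to channel-wise intensity changes is the kind of property that motivates using such a measure when query and exemplar come from different domains.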




  1. Pretrained model was obtained from

  2. Pretrained model was obtained from

  3. Pretrained model was obtained from

  4. Our code is available at




We thank Sarena Wiesner and Yaron Shor for providing access to their dataset. This work was partially funded by the Center for Statistics and Applications in Forensic Evidence (CSAFE) through NIST Cooperative Agreement #70NANB15H176.

Author information



Corresponding author

Correspondence to Bailey Kong.

Additional information


Communicated by Tae-Kyun Kim, Stefanos Zafeiriou, Ben Glocker and Stefan Leutenegger.


About this article


Cite this article

Kong, B., Supančič, J., Ramanan, D. et al. Cross-Domain Image Matching with Deep Feature Maps. Int J Comput Vis 127, 1738–1750 (2019).



  • Normalized cross-correlation
  • Similarity metric
  • Cross-domain image matching