We investigate the problem of automatically determining what type of shoe left an impression found at a crime scene. This recognition problem is made difficult by the variability in types of crime scene evidence (ranging from traces of dust or oil on hard surfaces to impressions made in soil) and the lack of comprehensive databases of shoe outsole tread patterns. We find that mid-level features extracted by pre-trained convolutional neural nets are surprisingly effective descriptors for this specialized domain. However, the choice of similarity measure for matching exemplars to a query image is essential to good performance. For matching multi-channel deep features, we propose the use of multi-channel normalized cross-correlation and analyze its effectiveness. Our proposed metric significantly improves performance in matching crime scene shoeprints to laboratory test impressions. We also show its effectiveness in other cross-domain image retrieval problems: matching facade images to segmentation labels and aerial photos to map images. Finally, we introduce a discriminatively trained variant and fine-tune our system through our proposed metric, obtaining state-of-the-art performance.
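To make the idea of multi-channel normalized cross-correlation concrete, the following is a minimal sketch in plain Python, not the authors' implementation (their released code is linked in the notes). It assumes one common reading of the measure: each feature channel is mean-centered and scaled by its own standard deviation, the per-channel correlations are computed over spatial positions, and the results are averaged across channels. The function name `mcncc`, the flattened list-of-channels layout, and the stabilizer `eps` are assumptions made for illustration only.

```python
import math


def mcncc(x, y, eps=1e-8):
    """Average per-channel normalized cross-correlation of two feature maps.

    x, y: lists of channels; each channel is a flat list of activations
    over the same spatial positions (hypothetical layout for this sketch).
    eps: small constant (an assumption) that keeps constant channels
    from causing a division by zero.
    """
    if not x or len(x) != len(y):
        raise ValueError("x and y need the same non-zero number of channels")
    total = 0.0
    for xc, yc in zip(x, y):
        if not xc or len(xc) != len(yc):
            raise ValueError("matching channels need the same non-zero size")
        n = len(xc)
        mean_x = sum(xc) / n
        mean_y = sum(yc) / n
        # Covariance and variances over spatial positions for this channel.
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xc, yc)) / n
        var_x = sum((a - mean_x) ** 2 for a in xc) / n
        var_y = sum((b - mean_y) ** 2 for b in yc) / n
        total += cov / math.sqrt((var_x + eps) * (var_y + eps))
    # Average the per-channel correlations.
    return total / len(x)


def rank_exemplars(query, exemplars):
    """Return exemplar indices sorted from most to least similar to query."""
    scores = [mcncc(query, e) for e in exemplars]
    return sorted(range(len(exemplars)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    query = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
    # Same pattern under a per-channel gain and offset change.
    rescaled = [[2.0 * v + 5.0 for v in ch] for ch in query]
    # Second channel reversed relative to the query.
    mismatch = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
    print(mcncc(query, rescaled))   # close to 1
    print(mcncc(query, mismatch))   # close to 0
    print(rank_exemplars(query, [mismatch, rescaled]))
```

Because every channel is normalized separately, a per-channel change in gain or offset leaves the score essentially unchanged, which is one plausible reason such a measure suits matching across domains with different appearance statistics.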
Pretrained model was obtained from http://www.vlfeat.org/matconvnet/models/imagenet-resnet-50-dag.mat.
Pretrained model was obtained from http://www.vlfeat.org/matconvnet/models/imagenet-googlenet-dag.mat.
Pretrained model was obtained from http://www.vlfeat.org/matconvnet/models/imagenet-vgg-verydeep-16.mat.
Our code is available at http://github.com/bkong/MCNCC.
We thank Sarena Wiesner and Yaron Shor for providing access to their dataset. This work was partially funded by the Center for Statistics and Applications in Forensic Evidence (CSAFE) through NIST Cooperative Agreement #70NANB15H176.
Communicated by Tae-Kyun Kim, Stefanos Zafeiriou, Ben Glocker and Stefan Leutenegger.
Kong, B., Supančič, J., Ramanan, D. et al. Cross-Domain Image Matching with Deep Feature Maps. Int J Comput Vis 127, 1738–1750 (2019). https://doi.org/10.1007/s11263-018-01143-3
- Normalized cross-correlation
- Similarity metric
- Cross-domain image matching