Multimedia Tools and Applications, Volume 78, Issue 6, pp 6513–6528

Art painting detection and identification based on deep learning and image local features

  • Yiyu Hong
  • Jongweon Kim


Many art paintings appear in film scenes and TV programs as decoration. To prevent unauthorized use of copyrighted art paintings, we propose a method that combines a deep learning based object detector with hand-crafted image local features to identify copyrighted art paintings in the images that contain them. The object detector is trained on data we collected so that it can detect art paintings. Given a query image, the object detector first locates the art painting regions. The copyrighted art paintings are then identified by matching image local features between those regions and the original copyrighted art paintings, which are stored in advance. To test the proposed method from different aspects, we prepared four kinds of test images: Famous, Monitor Easy, Monitor Hard, and Print. We then analyze the practicability of our method based on the experimental results on these test images. In addition, our approach outperformed the Scale Invariant Feature Transform (SIFT) by more than 20%.
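The identification step described above (matching local features from a detected region against stored originals) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which descriptor or matching rule is used, so the sketch assumes plain tuples as descriptors, Euclidean distance, and a nearest-neighbor ratio test in the style commonly used with SIFT. The names `ratio_test_matches`, `identify`, `ratio`, and `min_matches`, and all the toy data, are hypothetical. The detector stage is left out; `region` stands for descriptors already extracted from one detected painting region.

```python
import math


def ratio_test_matches(query_desc, ref_desc, ratio=0.75):
    """Count query descriptors whose nearest reference descriptor is clearly
    closer than the second nearest (a ratio test; an assumed matching rule)."""
    if len(ref_desc) < 2:
        return 0  # the ratio test needs at least two candidates
    good = 0
    for q in query_desc:
        dists = sorted(math.dist(q, r) for r in ref_desc)
        if dists[0] < ratio * dists[1]:
            good += 1
    return good


def identify(region_desc, database, min_matches=2):
    """Return the stored painting with the most good matches,
    or None if no painting reaches min_matches (hypothetical threshold)."""
    best_name, best_count = None, 0
    for name, ref_desc in database.items():
        count = ratio_test_matches(region_desc, ref_desc)
        if count > best_count:
            best_name, best_count = name, count
    return best_name if best_count >= min_matches else None


# Toy 2-D "descriptors" standing in for real local features.
database = {
    "painting_a": [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)],
    "painting_b": [(50.0, 50.0), (60.0, 50.0), (50.0, 60.0)],
}
region = [(0.1, 0.0), (10.0, 0.2), (0.0, 9.9)]
result = identify(region, database)
print(result)
```

Real descriptors have many more dimensions, and a practical system would usually add a geometric consistency check before accepting a match. The sketch only shows why restricting matching to a detected region helps: descriptors from the surrounding scene never enter the comparison.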


Art painting detection; Art painting identification; Art painting dataset; Image local feature; Deep learning; Machine learning; Feature extraction



This research is supported by the Ministry of Culture, Sports and Tourism (MCST) and the Korea Creative Content Agency (KOCCA) in the Culture Technology (CT) Research & Development Program 2017.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Copyright Protection, Sangmyung University, Seoul, South Korea
  2. Department of Electronics Engineering, Sangmyung University, Seoul, South Korea
