Multimedia Tools and Applications, Volume 77, Issue 24, pp 32179–32211

Accumulative image categorization: a personal photo classification method for progressive collection

  • Jiagao Hu
  • Zhengxing Sun
  • Yunhan Sun
  • Jinlong Shi


With the explosive growth of personal photos, an effective classification tool is urgently needed to organize progressively growing image collections. To handle such dynamically growing collections, we present a new method that categorizes images effectively by integrating image clustering, incremental updating and user feedback in an online framework. To reduce the user's burden and respect user-specific preferences during image classification, we propose several strategies that progressively learn a customized classification model for each user. Firstly, we use a multi-view learning method to learn the user's preferred classification perspective. Secondly, we cluster similar images into groups according to the user's preference, so that all images in a group can be categorized at once. Thirdly, we propose a multi-centroid nearest class mean classifier that learns the user's preferred category granularity online, and use it to classify the image groups. Unlike offline systems, where pre-labeling and batch training often take hours or even days, our approach is fully online: it alternates between updating the classification model and classifying newly acquired images with little delay. Experimental results and a user study demonstrate the effectiveness of the proposed method.
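To make the third strategy more concrete, the following is a minimal sketch of a multi-centroid nearest class mean classifier with online updates. It is an illustration only and not the authors' implementation: the class name `MultiCentroidNCM`, the distance threshold `new_centroid_dist` used to decide when a class gets an additional centroid, and the use of a group's mean feature to label a whole cluster are all assumptions made for this example. Feature vectors are plain lists of floats.

```python
import math


class MultiCentroidNCM:
    """Toy multi-centroid nearest class mean classifier (illustrative only)."""

    def __init__(self, new_centroid_dist):
        # Hypothetical threshold: a labeled sample farther than this from every
        # centroid of its class starts a new centroid (a finer sub-category).
        self.new_centroid_dist = new_centroid_dist
        self.centroids = {}  # label -> list of [mean_vector, sample_count]

    def update(self, x, label):
        """Absorb one labeled feature vector without retraining from scratch."""
        entries = self.centroids.setdefault(label, [])
        if entries:
            nearest = min(entries, key=lambda e: math.dist(e[0], x))
            if math.dist(nearest[0], x) <= self.new_centroid_dist:
                mean, count = nearest
                # Running mean: m <- m + (x - m) / (n + 1)
                nearest[0] = [m + (xi - m) / (count + 1) for m, xi in zip(mean, x)]
                nearest[1] = count + 1
                return
        # No centroid of this class is close enough: open a new one.
        entries.append([list(x), 1])

    def predict(self, x):
        """Return the label owning the centroid closest to x (None if untrained)."""
        best_label, best_dist = None, math.inf
        for label, entries in self.centroids.items():
            for mean, _ in entries:
                d = math.dist(mean, x)
                if d < best_dist:
                    best_label, best_dist = label, d
        return best_label

    def predict_group(self, group):
        """Label a whole cluster of images at once via the group's mean feature."""
        dim = len(group[0])
        mean = [sum(v[i] for v in group) / len(group) for i in range(dim)]
        return self.predict(mean)


# Usage: learning and classification alternate, one sample or group at a time.
clf = MultiCentroidNCM(new_centroid_dist=1.0)
clf.update([0.0, 0.0], "beach")
clf.update([0.2, 0.0], "beach")   # close to the first centroid: merged into it
clf.update([5.0, 5.0], "beach")   # far away: becomes a second "beach" centroid
clf.update([0.0, 5.0], "pets")
group_label = clf.predict_group([[4.9, 5.0], [5.1, 5.2]])
```

Because each class keeps several centroids and every update is a running-mean step, adding a labeled image costs time proportional to the number of centroids in that class, which is what makes alternating between learning and classification cheap in a sketch like this.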


Image classification; Online learning; Image clustering; Nearest class mean classifier; Progressive collection



This work is supported by the National High Technology Research and Development Program of China (No. 2007AA01Z334); the National Natural Science Foundation of China (No. 61321491, 61272219); the Innovation Fund of the State Key Laboratory for Novel Software Technology (No. ZZKT2013A12, ZZKT2016A11); the Program for New Century Excellent Talents in University of China (No. NCET-04-04605); and the Nanjing University Innovation and Creative Program for PhD Candidates (No. 2016013).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. State Key Lab for Novel Software Technology, Nanjing University, Nanjing, China
  2. School of Computer Science and Engineering, Jiangsu University of Science and Technology, Zhenjiang, China
