Multimedia Tools and Applications, Volume 78, Issue 6, pp 6721–6744

Group sparse based locality – sensitive dictionary learning for video semantic analysis

  • Ben-Bright Benuwa
  • Yongzhao Zhan
  • JunQi Liu
  • Jianping Gou
  • Benjamin Ghansah
  • Ernest K. Ansah


Sparse Representation-based Classifier (SRC) and Dictionary Learning (DL) have greatly improved classification performance in image recognition in recent years. In video semantic analysis, the locality structure of video semantic data carries discriminative information that is essential for classification, yet current sparse representation-based approaches have not fully exploited it. Furthermore, video features from the same video category do not necessarily yield similar coding results. To address these issues, we propose a novel DL method, called Group Sparsity Locality-Sensitive Dictionary Learning (GSLSDL), for video semantic analysis. In the proposed GSLSDL, a discriminant loss function for the video category, based on group sparse coding of the sparse coefficients, is introduced into the structure of the Locality-Sensitive Dictionary Learning (LSDL) method. After the optimized dictionary is solved, the sparse coefficients for the testing video feature samples are obtained. The video semantic classification result is then obtained by minimizing the error between the original and reconstructed samples. The experimental results show that the proposed GSLSDL significantly improves the performance of video semantic detection compared with the competing methods and is robust across diverse video environments.
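The final classification step described above (code a test sample over the learned dictionary, then pick the category whose atoms reconstruct it with the smallest error) can be illustrated with a minimal sketch. This is not the GSLSDL algorithm itself: the dictionary here is fixed rather than learned, and a locality-weighted ℓ2 penalty with a closed-form solution stands in for the paper's group-sparse coding step. The function names, the `lam` parameter, and the toy data are all hypothetical.

```python
import numpy as np


def locality_weighted_code(x, D, lam=0.1):
    """Code x over dictionary D (atoms as columns).

    Assumed stand-in objective (not the paper's):
        min_a ||x - D a||^2 + lam * sum_k p_k * a_k^2,
    where p_k = ||x - d_k||^2 penalises atoms far from x.
    """
    # Squared Euclidean distance from x to every atom.
    p = np.sum((D - x[:, None]) ** 2, axis=0)
    # Normal equations of the weighted ridge problem, plus a tiny
    # diagonal term for numerical stability.
    A = D.T @ D + lam * np.diag(p) + 1e-8 * np.eye(D.shape[1])
    return np.linalg.solve(A, D.T @ x)


def classify_by_residual(x, D, atom_labels, lam=0.1):
    """Return the class with the smallest class-wise reconstruction error."""
    a = locality_weighted_code(x, D, lam)
    classes = np.unique(atom_labels)
    residuals = []
    for c in classes:
        mask = atom_labels == c
        # Reconstruct x using only the atoms (and coefficients) of class c.
        recon = D[:, mask] @ a[mask]
        residuals.append(float(np.linalg.norm(x - recon)))
    best = classes[int(np.argmin(residuals))]
    return best, dict(zip(classes.tolist(), residuals))


if __name__ == "__main__":
    # Toy dictionary: two atoms per class in a 4-dimensional feature space.
    D = np.eye(4)
    atom_labels = np.array([0, 0, 1, 1])
    x = np.array([0.9, 0.2, 0.0, 0.05])  # lies close to the class-0 atoms
    label, residuals = classify_by_residual(x, D, atom_labels)
    print(label, residuals)
```

In this sketch the locality weights shrink the coefficients of atoms that are far from the sample, so the reconstruction is dominated by nearby atoms; the class-wise residual comparison then follows the usual SRC decision rule.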


Group sparsity, Sparse representation, Locality information, Dictionary learning, Video semantic analysis



This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61170126 and 61502208), the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (Grant No. 14KJB520007), the China Postdoctoral Science Foundation (Grant No. 2015M570411), the Natural Science Foundation of Jiangsu Province of China (Grant No. BK20150522) and the Research Foundation for Talented Scholars of Jiangsu University (Grant No. 14JDG037).

Compliance with ethical standards

Conflict of Interest

The authors declare that there are no conflicts of interest.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, China
  2. School of Computer Science, Data Link Institute, Tema, Ghana
