Human action recognition: a framework of statistical weighted segmentation and rank correlation-based selection

  • Muhammad Sharif
  • Muhammad Attique Khan (corresponding author)
  • Farooq Zahid
  • Jamal Hussain Shah
  • Tallha Akram
Theoretical advances


Human action recognition from video sequences has received much attention lately in the field of computer vision due to its range of applications in surveillance, healthcare, smart homes, and tele-immersion, to name but a few. It still faces several challenges, however, such as variation in human appearance, occlusion, changes in illumination, and complex backgrounds. In this article, we address the problems of multiple human detection and classification using a novel statistical weighted segmentation and a rank correlation-based feature selection approach. Initially, preprocessing is performed on a set of frames to remove noise and to make the foreground maximally distinguishable from the background. A novel weighted segmentation method is then introduced to extract the human subject prior to feature extraction. Ternary features are exploited, covering color, shape, and texture, which are later combined using a serial-based feature fusion method. To avoid redundancy, a rank correlation-based feature selection technique is employed; it acts as a feature optimizer and leads to improved classification accuracy. The proposed method is validated on six datasets (Weizmann, KTH, MuHAVi, WVU, UCF Sports, and MSR Action) using seven performance measures. A fair comparison with existing work is also provided, demonstrating the advantage of the proposed method over competing techniques.
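The pipeline described above combines the three feature blocks by serial fusion (per-sample concatenation) and then prunes redundant dimensions with a rank-correlation criterion. A minimal sketch of these two steps is shown below; it is an illustration, not the paper's implementation. The Spearman coefficient is computed with a simple rank transform that ignores ties, the redundancy threshold of 0.9 is an assumed value, and the greedy keep/drop rule is one plausible reading of "rank correlation-based selection" (the final weighted KNN classifier is omitted).

```python
import numpy as np

def serial_fusion(*feature_blocks):
    # Serial fusion: concatenate per-sample feature vectors side by side.
    # Each block has shape (n_samples, n_features_i).
    return np.concatenate(feature_blocks, axis=1)

def _rank(v):
    # Simple rank transform; ties are broken by position
    # (a simplification assumed for this sketch).
    r = np.empty(len(v), dtype=float)
    r[np.argsort(v)] = np.arange(len(v))
    return r

def spearman(a, b):
    # Spearman's rho = Pearson correlation of the rank-transformed vectors.
    ra, rb = _rank(a), _rank(b)
    ra -= ra.mean()
    rb -= rb.mean()
    denom = np.sqrt((ra ** 2).sum() * (rb ** 2).sum())
    return 0.0 if denom == 0 else float((ra * rb).sum() / denom)

def select_by_rank_correlation(X, threshold=0.9):
    # Greedily keep a feature column unless it is highly rank-correlated
    # (|rho| >= threshold) with a column that was already kept.
    kept = []
    for j in range(X.shape[1]):
        if all(abs(spearman(X[:, j], X[:, k])) < threshold for k in kept):
            kept.append(j)
    return kept

# Usage with hypothetical per-frame feature blocks:
# fused = serial_fusion(color_feats, shape_feats, texture_feats)
# reduced = fused[:, select_by_rank_correlation(fused)]
```

Because rank correlation depends only on feature orderings, the selection step is insensitive to monotone rescaling of individual features, which is convenient when fusing heterogeneous color, shape, and texture descriptors.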


Action recognition · Weighted segmentation · Feature selection · Rank correlation · Weighted KNN



The authors would like to thank the HEC Startup Research Grant Program (SRGP), Pakistan (Project # 1307).

Author contributions

MS conceived the idea, developed the classification design, and identified the target application. MAK performed the simulations, developing and integrating the different patches of code, and is responsible for the complete write-up; the accuracy criteria were also finalized and simulated by this author. FZ and JHS gave the article its final shape, identified several issues, and helped the primary authors overcome those shortcomings. TA was responsible for the final proofreading and provided technical support in the classification step, which falls within this author's research area. AR provided technical support in several sections, including feature extraction and fusion, as well as on issues raised in developing the selection approach. All authors read and approved the final manuscript.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no competing interests.

Availability of data and material

Six publicly available datasets are used in this research to validate the proposed method: Weizmann, KTH, MuHAVi, WVU, UCF Sports, and MSR Action.



Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. Department of Computer Science, COMSATS University Islamabad, Wah Cantt, Pakistan (Muhammad Sharif, Farooq Zahid, Jamal Hussain Shah)
  2. Department of Computer Science and Engineering, HITEC University, Taxila, Pakistan (Muhammad Attique Khan, corresponding author)
  3. Department of Electrical and Computer Engineering, COMSATS University Islamabad, Wah Cantt, Pakistan (Tallha Akram)
