Multimedia Tools and Applications, Volume 77, Issue 16, pp 21053–21082

Foveation-based content adaptive root mean squared error for video quality assessment

  • Mario Vranješ
  • Snježana Rimac-Drlje
  • Denis Vranješ


When video is compressed and transmitted over heterogeneous networks, it is necessary to ensure satisfactory quality for the end user. Since human observers are the end users of video applications, it is very important that the characteristics of the human visual system (HVS) are taken into account during video quality evaluation. This paper deals with video quality assessment (VQA) based on HVS characteristics and proposes a novel full-reference (FR) VQA metric called the Foveation-based content Adaptive Root Mean Squared Error (FARMSE). FARMSE uses several HVS characteristics that significantly influence the perception of distortions in a video, primarily foveated vision, the reduction of spatial acuity due to motion, and spatial masking. Foveated vision refers to the variable resolution of the HVS across the viewing field, where the highest resolution is at the point of fixation. The point of fixation is projected onto the fovea, the area of the retina with the highest density of photoreceptors. The part of the image that falls on the fovea is perceived with the highest acuity, whereas spatial acuity decreases as the distance of the image part from the fovea increases. Spatial acuity decreases further if the eyes cannot track moving objects. Both mechanisms influence the contrast sensitivity of the HVS. Contrast sensitivity is frequency dependent, and FARMSE uses Haar filters to exploit this dependence. Furthermore, spatial masking is implemented in each frequency channel. The performance of FARMSE is compared with that of nine state-of-the-art VQA metrics on two different databases, LIVE and ECVQ. Additionally, the metrics are compared in terms of computational complexity. The experiments show that FARMSE achieves high performance when predicting the quality of videos with different resolutions, degradation types and content types. FARMSE outperforms most of the analyzed metrics and is comparable to the best publicly available metrics, including the well-known MOtion-based Video Integrity Evaluation (MOVIE) index. Moreover, the computational complexity of FARMSE is significantly lower than that of the metrics with comparable prediction accuracy.
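
To make the combination of Haar filtering and foveation more concrete, the following Python sketch decomposes a reference and a distorted frame into one level of Haar subbands and computes a foveation-weighted RMSE per subband. It is only an illustration under stated assumptions and is not the FARMSE formulation, which the abstract does not specify: the function names, the falloff `1 / (1 + e / e2)`, the constant `e2_deg`, and the viewing distance expressed in pixels are all hypothetical choices, and motion-related acuity reduction and spatial masking are left out.

```python
import numpy as np


def haar_level(frame):
    """One-level orthonormal 2D Haar decomposition into four subbands."""
    f = np.asarray(frame, dtype=np.float64)
    # Crop to even dimensions so that pixels pair up in 2x2 blocks.
    f = f[: f.shape[0] // 2 * 2, : f.shape[1] // 2 * 2]
    a, b = f[0::2, 0::2], f[0::2, 1::2]
    c, d = f[1::2, 0::2], f[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # local average (low-pass)
    lh = (a - b + c - d) / 2.0  # difference across columns
    hl = (a + b - c - d) / 2.0  # difference across rows
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def foveation_weights(shape, fixation, viewing_dist_px, e2_deg=2.3):
    """Illustrative weight map: 1 at the fixation point, falling with eccentricity.

    The falloff 1 / (1 + e / e2) and the value of e2_deg are assumptions
    made for this sketch, not parameters taken from the paper.
    """
    rows, cols = np.indices(shape)
    dist_px = np.hypot(rows - fixation[0], cols - fixation[1])
    ecc_deg = np.degrees(np.arctan(dist_px / viewing_dist_px))
    return 1.0 / (1.0 + ecc_deg / e2_deg)


def foveated_subband_rmse(ref, dist, fixation, viewing_dist_px):
    """Foveation-weighted RMSE between ref and dist in each Haar subband."""
    ref_bands = haar_level(ref)
    dist_bands = haar_level(dist)
    # Subbands have half the frame resolution, so coordinates are halved.
    w = foveation_weights(
        ref_bands[0].shape,
        (fixation[0] / 2.0, fixation[1] / 2.0),
        viewing_dist_px / 2.0,
    )
    scores = {}
    for name, r, d in zip(("LL", "LH", "HL", "HH"), ref_bands, dist_bands):
        scores[name] = float(np.sqrt(np.sum(w * (r - d) ** 2) / np.sum(w)))
    return scores


# Usage: the same distortion placed near and far from a central fixation.
ref = np.zeros((64, 64))
near, far = ref.copy(), ref.copy()
near[28:36, 28:36] += 20.0
far[0:8, 0:8] += 20.0
print(foveated_subband_rmse(ref, near, fixation=(32, 32), viewing_dist_px=200.0))
print(foveated_subband_rmse(ref, far, fixation=(32, 32), viewing_dist_px=200.0))
```

In this sketch an identical distortion contributes more to the score when it lies close to the assumed fixation point than when it lies in the periphery, which mirrors the acuity falloff described above.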


FARMSE, Foveated vision, Human visual system, Spatio-temporal activity, Video quality assessment



This work was supported by the J.J. Strossmayer University of Osijek business fund through the internal competition for the research and artistic projects "IZIP-2016" (project title: "Providing of digital video signal based services in rural and rarely populated areas").



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2017

Authors and Affiliations

  1. Faculty of Electrical Engineering, Computer Science and Information Technology, University of Osijek, Osijek, Croatia
