Multimedia Tools and Applications

Volume 77, Issue 23, pp 31067–31093

Saturation-aware human attention region of interest algorithm for efficient video compression

  • Sylvia O. N’guessan
  • Nam Ling


We propose a saturation-aware human attention region-of-interest (SA-HAROI) video compression method that applies a perceptual adaptive quantization algorithm to video frames as a function of the distribution of their luminance, motion vectors, and color saturation. Our work applies a psycho-visual study which demonstrated that human attention automatically enhances perceived saturation. Consequently, the adaptive quantization phase of our compression algorithm is characterized by a luminance- and saturation-aware just-noticeable distortion (JND) function. Experiments on 18 videos with resolutions ranging from QCIF to 4K showed that our method achieves higher compression than both the H.264/AVC JM and the HEVC HM reference implementations while maintaining subjective quality. Compared with both reference implementations (JM and HM) under an IPPP coding structure, our algorithm performed best on HD and 4K videos, yielding an average bit-rate reduction of 15% and, in certain cases, an encoding-time reduction of about 20%. Finally, after comparing our method with other similar techniques, we conclude that saturation is a significant parameter for improving video compression.
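The abstract describes adaptive quantization driven by per-block luminance, motion, and saturation statistics. The sketch below is a minimal illustration of that idea, not the authors' actual SA-HAROI formula: the attention weights, the motion normalization, and the QP-offset mapping are all assumptions introduced here. It computes HSV-style saturation for a block of RGB pixels and maps an attention score to a quantization parameter, so that likely attention regions are quantized more finely.

```python
import numpy as np

def block_saturation(rgb_block):
    """HSV-style saturation, (max - min) / max per pixel, averaged
    over the block. rgb_block has shape (H, W, 3) with values 0-255."""
    mx = rgb_block.max(axis=-1).astype(float)
    mn = rgb_block.min(axis=-1).astype(float)
    sat = np.where(mx > 0, (mx - mn) / np.where(mx > 0, mx, 1.0), 0.0)
    return float(sat.mean())

def block_qp(luma_mean, motion_mag, sat_mean, base_qp=32, max_offset=6):
    """Map block statistics to a QP value: blocks likely to attract
    attention (saturated, moving, bright) get a negative QP offset
    (finer quantization); flat background blocks get a positive one.
    The 0.5/0.3/0.2 weights are illustrative, not from the paper."""
    attention = (0.5 * sat_mean
                 + 0.3 * min(motion_mag / 16.0, 1.0)   # motion in pixels/frame
                 + 0.2 * (luma_mean / 255.0))
    offset = round(max_offset * (0.5 - attention) * 2)
    return int(np.clip(base_qp + offset, 0, 51))       # H.264/HEVC QP range
```

With these weights, a saturated, fast-moving block receives a lower QP (more bits) than a gray, static one at the same base QP, which is the qualitative behavior a saturation-aware ROI scheme targets.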


Adaptive perceptual quantization · Human attention · Human visual system · Just-noticeable distortion · Region-of-interest · Saturation · Streaming media · Video coding · Visual communication



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Engineering, Santa Clara University, Santa Clara, USA
