Abstract
The parallel computing power of the Graphics Processing Unit (GPU) has proven highly effective for accelerating motion estimation in conventional hybrid video encoding. Unfortunately, when macroblocks are processed in parallel, the motion information of neighboring macroblocks is not available to the current macroblock, which makes parallel motion estimation on the GPU difficult to do well. To work around this problem while still achieving a high acceleration ratio, most existing solutions ignore the motion vector cost, which inevitably causes severe rate-distortion loss. In this paper, a novel motion vector extrapolation based approach (MVEA) is presented for enhancing the rate-distortion performance of parallel motion estimation on the GPU. It builds on a study of motion vector recovery strategies used for frame-loss error concealment. The efficient implementation of MVEA on the Compute Unified Device Architecture (CUDA) is also investigated. Simulation results show that MVEA achieves a peak signal-to-noise ratio (PSNR) enhancement of up to 0.8 dB with a negligible increase in computational cost.
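The idea summarized above can be illustrated with a minimal sketch. It is not the authors' MVEA implementation: it assumes a simple overlap-based extrapolation of the previous frame's motion field (a strategy common in frame-loss error concealment), a 16x16 macroblock, integer-pixel motion vectors, and a signed Exp-Golomb bit estimate for the motion vector difference. The names `mv_bits`, `extrapolate_mvs`, and `rd_cost` are hypothetical. Because the predictor depends only on the previous frame, every macroblock of the current frame could evaluate its motion vector cost independently, which is what a GPU kernel would need.

```python
def mv_bits(d):
    """Signed Exp-Golomb code length in bits for one MV difference component."""
    code_num = 2 * abs(d) - (1 if d > 0 else 0)
    return 2 * (code_num + 1).bit_length() - 1


def overlap(ax, ay, bx, by, mb):
    """Overlap area of two mb x mb squares given their top-left corners."""
    w = max(0, mb - abs(ax - bx))
    h = max(0, mb - abs(ay - by))
    return w * h


def extrapolate_mvs(prev_mvs, mb=16):
    """Predict one MV per macroblock of the current frame from the previous
    frame's motion field (assumption: constant motion between frames).

    prev_mvs[r][c] = (dx, dy) points from a block in the previous frame to
    its match in that frame's reference, so the block is assumed to continue
    to (x - dx, y - dy) in the current frame. Each current macroblock takes
    the MV of the projected block covering it most, and falls back to the
    co-located MV when nothing covers it.
    """
    rows, cols = len(prev_mvs), len(prev_mvs[0])
    best = [[(0, prev_mvs[r][c]) for c in range(cols)] for r in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dx, dy = prev_mvs[r][c]
            px, py = c * mb - dx, r * mb - dy  # projected top-left corner
            # A projected block can touch at most a 2 x 2 group of macroblocks.
            for rr in range(max(0, py // mb), min(rows, py // mb + 2)):
                for cc in range(max(0, px // mb), min(cols, px // mb + 2)):
                    area = overlap(px, py, cc * mb, rr * mb, mb)
                    if area > best[rr][cc][0]:
                        best[rr][cc] = (area, (dx, dy))
    return [[mv for _, mv in row] for row in best]


def rd_cost(sad, mv, pmv, lam):
    """Lagrangian cost J = SAD + lambda * R(mv - pmv), pmv being the predictor."""
    rate = mv_bits(mv[0] - pmv[0]) + mv_bits(mv[1] - pmv[1])
    return sad + lam * rate


# One row of three macroblocks; the middle block moves 12 pixels to the right.
pmvs = extrapolate_mvs([[(0, 0), (-12, 0), (-16, 0)]])
```

In this toy example the third macroblock inherits the middle block's vector because the projected middle block covers most of it. A candidate equal to the predictor then costs only two bits of rate in `rd_cost`, whereas ignoring the motion vector cost altogether would treat all candidates with the same SAD as equally good.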
Acknowledgments
The authors are thankful for the writing help from Prof. Lai-Man Po at City University of Hong Kong. We also thank the anonymous reviewers for their time and valuable comments which helped improve the paper quality significantly.
About this article
Cite this article
Gao, Y., Zhou, J. Motion vector extrapolation for parallel motion estimation on GPU. Multimed Tools Appl 68, 701–715 (2014). https://doi.org/10.1007/s11042-012-1074-4
DOI: https://doi.org/10.1007/s11042-012-1074-4