Multimedia Tools and Applications, Volume 53, Issue 2, pp 431–457

Evaluation of data-parallel H.264 decoding approaches for strongly resource-restricted architectures

  • Florian H. Seitner
  • Michael Bleyer
  • Margrit Gelautz
  • Ralf M. Beuschel


Decoding an H.264 video stream is a computationally demanding multimedia application that poses serious challenges for current processor architectures. For processors with strongly limited computational resources, a natural way to tackle this problem is to use multi-core systems. The contribution of this paper is a systematic overview and performance evaluation of parallel video decoding approaches. We focus on decoder splittings for the strongly resource-restricted environments typical of mobile devices. For the evaluation, we introduce a high-level methodology that can estimate the runtime behaviour of multi-core decoding architectures. We use this methodology to investigate six methods for data-parallel splitting of an H.264 decoder. These methods are compared against each other in terms of runtime complexity, core usage, inter-communication and bus transfers. We present benchmark results for different numbers of processor cores. Our results are intended to help in finding the splitting strategy that is best suited to the targeted hardware architecture.


Video decoding, H.264/AVC, Multimedia, Multi-core, Embedded architectures



This work has been supported by the Austrian Federal Ministry of Transport, Innovation, and Technology under the FIT-IT project VENDOR (project no. 812429). Michael Bleyer would like to acknowledge the Austrian Science Fund (FWF) for financial support under project P19797.



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Florian H. Seitner (1, corresponding author)
  • Michael Bleyer (1)
  • Margrit Gelautz (1)
  • Ralf M. Beuschel (1)

  1. Institute for Software Technology and Interactive Systems, Vienna University of Technology, Vienna, Austria
