Design and Implementation of H.264/AVC Decoder

  • Kun-Bin Lee

18.1 Introduction

Recent advances in multimedia and wired/wireless communication technologies have fundamentally changed the way we create, communicate, and consume audiovisual information. H.264/AVC, the latest international video coding standard [1], uses state-of-the-art coding tools and provides enhanced coding efficiency for a wide range of applications, including mobile multimedia broadcasting, video conferencing, Internet protocol television (IPTV), digital cinema, IP multimedia subsystem (IMS), and surveillance. As might be expected, this gain in coding efficiency and flexibility comes at the cost of greater complexity than earlier standards. Realizing these applications cost-effectively relies on VLSI implementation.

In this chapter, we review the issues and design techniques of H.264/AVC decoding systems. First, we discuss the development issues of H.264/AVC decoding systems. The performance of software-based and...


Keywords: Discrete Cosine Transform; Clock Cycle; Motion Compensation; Memory Bandwidth; Intra Prediction


References

  1. ITU-T Rec. H.264 and ISO/IEC 14496-10 (MPEG4-AVC), Advanced Video Coding for Generic Audiovisual Services, Mar. 2005.
  2. K.-B. Lee and T.-S. Chang, “Chapter 4: SoC memory system design,” Essential Issues in SoC Design: Designing Complex Systems-on-chip, pp. 73–118, Nov. 2006, Springer, Netherlands.
  3. C. Lee, W. Mangione-Smith, and M. Potkonjak, “MediaBench: A tool for evaluating multimedia and communication systems,” in Proc. Int. Symp. Microarchitectures, pp. 330–335, Dec. 1997.
  4. J. E. Fritts, F. W. Steiling, and J. A. Tucek, “MediaBench II Video: expediting the next generation of video systems research,” in Proc. SPIE, pp. 79–93, Jan. 2005.
  5. J. Fritts, Architecture and compiler design issues in programmable media processors, Ph.D. Thesis, Dept. of Electrical Engineering, Princeton University, 2000.
  6. B. Bishop, T.P. Kelliher, and M.J. Irwin, “A detailed analysis of MediaBench,” in Proc. Signal Processing Systems (SiPS 99), pp. 448–455, Oct. 1999.
  7. J.L. Hennessy and D.A. Patterson, Computer Architecture: A Quantitative Approach, 2nd ed., Morgan Kaufmann Publishers, San Francisco, 1996.
  8. K. Diefendorff and P.K. Dubey, “How multimedia workloads will change processor design,” IEEE Computer, vol. 30, no. 9, pp. 43–45, Sep. 1997.
  9. T. M. Conte et al., “Challenges to combining general-purpose and multimedia processors,” IEEE Computer, pp. 33–37, Dec. 1997.
  10. I. Kuroda and T. Nishitani, “Multimedia processors,” Proc. IEEE, vol. 86, pp. 1203–1221, Jun. 1998.
  11. S. Rixner et al., “A bandwidth-efficient architecture for media processing,” in Proc. ACM/IEEE Int. Symp. Microarchitecture: IEEE CS'98, Nov.–Dec. 1998, pp. 3–13.
  12. Y. Oshima, B.J. Sheu, and S.H. Jen, “High-speed memory architectures for multimedia applications,” IEEE Circ. Devices Mag., vol. 13, pp. 8–13, Jan. 1997.
  13. B. Prince, High Performance Memories: New Architecture DRAMs and SRAMs, John Wiley & Sons, 2nd ed., Jul. 1999, England.
  14. Y.-H. Park, S.-H. Han, J.-H. Lee, and H.-J. Yoo, “A 7.1-GB/s low-power rendering engine in 2-D array-embedded memory logic CMOS for portable multimedia system,” IEEE J. Solid-State Circuits, vol. 36, pp. 944–955, Jun. 2001.
  15. Ackland, “The role of VLSI in multimedia,” IEEE J. Solid-State Circuits, vol. 29, pp. 381–388, Apr. 1994.
  16. P. Pirsch, N. Demassieux, and W. Gehrke, “VLSI architectures for video compression-A survey,” Proc. IEEE, vol. 83, pp. 220–246, Feb. 1995.
  17. V. Baskaran and K. Konstantinides, Image and Video Compression Standards: Algorithms and Architecture, Kluwer Academic, Norwell, MA, 1995.
  18. P. Pirsch and H.-J. Stolberg, “VLSI implementations of image and video multimedia processing systems,” IEEE Trans. Circuits Syst. Video Technol., vol. 8, pp. 878–891, Nov. 1998.
  19. F. Catthoor et al., Custom Memory Management Methodology: Exploration of Memory Organisation for Embedded Multimedia System Design, Kluwer Academic Publishers, Boston, Sep. 1998.
  20. Francky Catthoor, Unified Low-power Design Flow for Data-dominated Multi-media and Telecom Applications, Kluwer Academic Publishers, Dordrecht, Jul. 2000.
  21. D. Laneer, M. Cornero, G. Goosens, and H. De Man, “Data routing: A paradigm for efficient data-path synthesis and code generation,” in Proc. 7th Int. Symp. High-Level Synthesis, 1994, pp. 17–22.
  22. S. Tarafdar and M. Leeser, “A data-centric approach to high-level synthesis,” IEEE Trans. Computer-Aided Design, vol. 19, no. 11, Nov. 2000, pp. 1251–1267.
  23. H. Samsom, F. Franssen, F. Catthoor, and H. De Man, “Verification of loop transformations for real time signal processing applications,” in VLSI Signal Process. VII, 1994, pp. 208–217.
  24. M. Cupak, F. Catthoor, and H.J. De Man, “Efficient system-level functional verification methodology for multimedia applications,” IEEE Des. Test. Comput., vol. 20, no. 2, pp. 56–64, Mar.–Apr. 2003.
  25. P.R. Panda, N. Dutt, A. Nicolau, Memory Issues in Embedded Systems-on-chip: Optimizations and Exploration, Kluwer Academic, Boston, 1999.
  26. J.L. Hennessy and D.A. Patterson, Computer Architecture: A Quantitative Approach, 3rd ed., Morgan Kaufmann Publishers, San Francisco, 2002.
  27. Anthony Cataldo, MPU designers target memory to battle bottlenecks, EE Times, (10/19/01, available on
  28. R.C. Schumann, “Design of the 21174 memory controller for DIGITAL personal workstations,” Digital Technical J., vol. 9, no. 2, pp. 57–70, 1997.
  29. R. Goering, “Philips design team wins EDAC award,” EEdesign, May 30, 2002.
  30. S. Dutta, R. Jensen, and A. Rieckmann, “Viper: A multiprocessor SoC for advanced set-top box and digital TV systems,” IEEE Des. Test. Comput., vol. 18, no. 5, pp. 21–31, Sep.–Oct. 2001.
  31. G. Martin and H. Chang, Winning the SoC Revolution: Experiences in Real Design, Kluwer Academic Publishers, Boston, Jun. 2003.
  32. M. Horowitz, A. Joch, and F. Kossentini, “H.264/AVC baseline profile decoder complexity analysis,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 704–716, Jul. 2003.
  33. M. Alvarez, E. Salamí, A. Ramirez, and M. Valero, “A performance characterization of high definition digital video decoding using H.264/AVC,” IEEE International Symposium on Workload Characterization, pp. 24–33, Oct. 2005.
  34. K. Ramkishor, “Media processor architectures from TI,” TI Developer Conference, Nov. 2004.
  35. Diamond Standard VDO Video Engines Product Brief, ver. 2–2007, available on:
  36. Y. Hu, A. Simpson, K. McAdoo, and J. Cush, “A high definition H.264/AVC hardware video decoder core for multimedia SoCs,” in Proc. IEEE Int. Symp. Consumer Electronics, pp. 385–389, Sept. 2004.
  37. C.C. Lin, et al., “A 160K gates/4.5 kb SRAM H.264 video decoder for HDTV applications,” IEEE J. Solid-State Circuits, vol. 42, no. 1, Jan. 2007.
  38.
  39. Y.-W. Huang, B.-Y. Hsieh, T.-C. Chen, and L.-G. Chen, “Analysis, fast algorithm, and VLSI architecture design for H.264/AVC intra frame coder,” IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 3, pp. 378–401, Mar. 2005.
  40. J.-W. Chen, C.-C. Lin, J.-I. Guo, and J.-S. Wang, “Low complexity architecture design of H.264 predictive pixel compensator for HDTV application,” in Proc. ICASSP, pp. 932–935, May 2006.
  41. E. Sahin and I. Hamzaoglu, “An efficient intra prediction hardware architecture for H.264 video decoding,” in Proc. DSD, pp. 448–454, 2007.
  42. M. Wien, “Variable block-size transforms for H.264/AVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 604–613, 2003.
  43. R. Wang, M. Li, J. Li, and Y. Zhang, “High throughput and low memory access sub-pixel interpolation architecture for H.264/AVC HDTV decoder,” IEEE Trans. Consumer Electron., vol. 51, no. 3, pp. 1006–1013, 2005.
  44. S.Z. Wang, T.A. Lin, T.M. Liu, and C.Y. Lee, “A new motion compensation design for H.264/AVC decoder,” in Proc. ISCAS, pp. 4558–4561, 2005.
  45. J.-W. Chen, C.-C. Lin, J.-I. Guo, and J.-S. Wang, “Low complexity architecture design of H.264 predictive pixel compensator for HDTV applications,” in Proc. ICASSP, vol. 3, pp. 932–935, May 2006.
  46. A. Azevedo, B. Zatt, L. Agostini, and S. Bampi, “MoCHA: a bi-predictive motion compensation hardware for H.264/AVC decoder targeting HDTV,” in Proc. ISCAS, pp. 1617–1620, May 2007.
  47. P. List et al., “Adaptive deblocking filter,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 614–619, Jul. 2003.
  48. Y.-W. Huang et al., “Architecture design for deblocking filter in H.264/JVT/AVC,” in Proc. Multimedia Expo., Jul. 2003, vol. 1, pp. 693–696.
  49. C.-C. Cheng, T.-S. Chang, and K.-B. Lee, “An in-place architecture for the deblocking filter in H.264/AVC,” IEEE Trans. Circuits Syst. II: Express Briefs, vol. 53, no. 7, pp. 530–534, Jul. 2006.
  50. K.Y. Min and J. W. Chong, “A memory and performance optimized architecture of deblocking filter in H.264/AVC,” Int. Conf. Multimedia Ubiquitous Engineering, pp. 220–225, Seoul, Korea, Apr. 2007.
  51. C.M. Chen and C.H. Chen, “Window architecture for deblocking filter in H.264/AVC,” IEEE Int. Symp. Signal Processing Information Technol., pp. 338–342, Vancouver, Canada, Aug. 2006.
  52. B. Sheng, W. Gao, and D. Wu, “An implemented architecture of deblocking filter for H.264/AVC,” in Proc. Int. Conf. Image Processing, vol. 1, pp. 665–668, Oct. 2004.
  53. H. Malvar, A. Hallapuro, M. Karczewicz, and L. Kerofsky, “Low-complexity transform and quantization in H.264/AVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 598–603, Jul. 2003.
  54. A. Madisetti and A.N. Willson, Jr., “A 100 MHz 2-D 8x8 DCT/IDCT processor for HDTV applications,” IEEE Trans. Circuits Syst. Video Technol., vol. 5, pp. 158–165, Apr. 1995.
  55. T.S. Chang, C.S. Kung, and C.W. Jen, “A simple processor core design for DCT/IDCT,” IEEE Trans. Circuits Syst. Video Technol., vol. 10, pp. 439–447, Apr. 2000.
  56. T.C. Wang, Y.W. Huang, H.C. Fang, and L.G. Chen, “Parallel 4 x 4 2-D transform and inverse transform architecture for MPEG-4 AVC/H.264,” in Proc. IEEE ISCAS, pp. 800–803, May 2003.
  57. K.H. Chen, J.I. Guo, and J.S. Wang, “A high-performance direct 2-D transform coding IP design for MPEG-4 AVC/H.264,” IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 4, pp. 472–483, Aug. 2006.
  58. L. Agostini, R. Porto, J. Guntzel, and I.S. Silva, “High throughput multitransform and multi-parallelism IP for H.264/AVC video compression standard,” in Proc. IEEE ISCAS, pp. 5419–5422, May 2006.
  59. S.Y. Tseng and T.W. Hsieh, “A pattern-search method for H.264/AVC CAVLC decoding,” in Proc. ICME, July 2006, pp. 1073–1076.
  60. Y.H. Moon, G.Y. Kim, and J.H. Kim, “An efficient decoding of CAVLC in H.264/AVC video coding standard,” IEEE Trans. Consumer Electron., vol. 51, no. 3, pp. 933–938, Aug. 2005.
  61. Y.H. Moon, “A new coeff-token decoding method with efficient memory access in H.264/AVC video coding standard,” IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 6, pp. 729–736, Jun. 2007.
  62. Y.H. Kim, Y.J. Yoo, J. Shin, B. Choi, and J. Paik, “Memory-efficient H.264/AVC CAVLC for fast decoding,” IEEE Trans. Consum. Electron., vol. 52, pp. 943–952, Aug. 2006.
  63. H.-C. Chang, C.-C. Lin, and J.-I. Guo, “A novel low-cost high-performance VLSI architecture for MPEG-4 AVC/H.264 CAVLC decoding,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), 2005, pp. 6110–6113.
  64. J. Nikara, S. Vassiliadis, J. Takala, and P. Liuha, “Multiple-symbol parallel decoding for variable length codes,” IEEE Trans. Very Large Scale Integration Systems, vol. 12, no. 7, pp. 676–685, Jul. 2004.
  65. Y.N. Wen, G.L. Wu, S.J. Chen, and Y.H. Hu, “Multiple-symbol parallel CAVLC decoder for H.264/AVC,” in Proc. IEEE APCCAS, Dec. 2006, pp. 1240–1243.
  66. Guo-Shiuan Yu and Tian-Sheuan Chang, “A zero-skipping multi-symbol CAVLC decoder for MPEG-4 AVC/H.264,” in Proc. ISCAS, pp. 21–24, May 2006.
  67. T.-H. Tsai, D.-L. Fang, and Y.-N. Pan, “A hybrid CAVLD architecture design with low complexity and low power considerations,” in Proc. ICME, pp. 1910–1913, Jul. 2007.
  68. D. Marpe and T. Wiegand, “A highly efficient multiplication-free binary arithmetic coder and its application in video coding,” in Proc. ICIP, Barcelona, Spain, pp. 263–266, Sept. 2003.
  69. W.B. Pennebaker, J.L. Mitchell, G.G. Langdon, and R.B. Arps, “An overview of the basic principles of the Q coder adaptive binary arithmetic coder,” IBM J. Res. Dev., vol. 32, pp. 717–726, Nov. 1988.
  70. J.L. Mitchell and W.B. Pennebaker, “Optimal hardware and software arithmetic coding procedures for the Q coder,” IBM J. Res. Dev., vol. 32, no. 6, pp. 727–736, Nov. 1988.
  71. D. Taubman and M.W. Marcellin, JPEG2000 Image Compression: Fundamentals, Standards and Practice, Kluwer, Boston, MA, 2002.
  72. M. Tarui, M. Oshita, T. Onoye, and I. Shirakawa, “High-speed implementation of JBIG arithmetic coder,” in Proc. IEEE TENCON, vol. 2, 1999, pp. 1291–1294.
  73. Y.-T. Hsiao, H.-D. Lin, K.-B. Lee, and C.-W. Jen, “High-speed memory-saving architecture for the embedded block coding in JPEG2000,” in Proc. Int. Symp. Circuits and Systems (ISCAS'02), Phoenix, USA, vol. 5, pp. 133–136, May 2002.
  74. K.-K. Ong, W.-H. Chang, Y.-C. Tseng, Y.-S. Lee, and C.-Y. Lee, “A high throughput low cost context-based adaptive arithmetic codec for multiple standards,” in Proc. Int. Conf. Image Processing, 2002, vol. 1, pp. 872–875.
  75. G. Feygin, P.G. Gulak, and P. Chow, “Architectural advances in the VLSI implementation of arithmetic coding for binary image compression,” in Proc. Data Compression Conference (DCC '94), 1994, pp. 254–263.
  76. J. Jiang and S. Jones, “Parallel design of arithmetic coding,” IEE Proceedings-E: Computer and Digital Techniques, vol. 141, pp. 327–333, Nov. 1994.
  77. J. Jiang, “Parallel design of Q coders for bilevel image compression,” in Proc. Int. Conf. Parallel and Distributed Systems, 1994, pp. 230–235.
  78. K.-B. Lee, J.-Y. Lin, and C.-W. Jen, “A multisymbol context-based arithmetic coding architecture for MPEG-4 shape coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 2, pp. 283–295, Feb. 2005.
  79. N. Brady, F. Bossen, and N. Murphy, “Context-based arithmetic encoding of 2D shape sequences,” in Proc. Int. Conf. Image Processing, Santa Barbara, CA, vol. I, pp. 29–32, Oct. 1997.
  80. Y. Yi and I.-C. Park, “High-speed H.264/AVC CABAC decoding,” IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 4, pp. 490–494, Apr. 2007.
  81. L. Li, Y. Song, T. Ikenaga, and S. Goto, “Hardware architecture design of CABAC codec for H.264/AVC,” in Proc. IEEE VLSI-DAT, Hsinchu, Taiwan, pp. 1–4, Apr. 2007.
  82. J.-W. Chen and Y.-L. Lin, “A high-performance hardwired CABAC decoder,” in Proc. ICASSP, Honolulu, Hawaii, USA, pp. 37–40, Apr. 15–20, 2007.
  83. W. Yu and Y. He, “A high performance CABAC decoding architecture,” IEEE Trans. Consumer Electron., vol. 51, no. 4, pp. 1352–1359, Nov. 2005.
  84. Y.-C. Yang et al., “A high throughput VLSI architecture design for H.264 context-based adaptive binary arithmetic decoding with lookahead parsing,” IEEE International Conference on Multimedia & Expo (ICME), pp. 357–360, Jul. 2006.
  85. G. Goossens et al., “Synthesis of flexible IC architectures for medium throughput real-time signal processing,” J. VLSI Signal Processing, vol. 5, no. 4, 1993.
  86. T.H. Meng, B. Gordon, E. Tsern, and A. Hung, “Portable video-on-demand in wireless communication,” in Proc. IEEE, vol. 83, no. 4, pp. 659–680, Apr. 1995.
  87. V. Tiwari, S. Malik, and A. Wolfe, “Power analysis of embedded software: a first step towards software power minimization,” in Proc. ICCAD, pp. 384–390, Nov. 1994.
  88. M. Winzker, P. Pirsch, and J. Reimers, “Architecture and memory requirements for stand-alone and hierarchical MPEG2 HDTV-decoders with synchronous DRAMs,” in Proc. Int. Symp. Circuits Systems, pp. 609–612, Apr. 1995.
  89. H. Kim and I.-C. Park, “Array address translation for SDRAM-based video processing applications,” Electron. Lett., vol. 35, pp. 1929–1931, Oct. 1999.
  90. H. Kim and I.-C. Park, “High-performance and low-power memory-interface architecture for video processing applications,” IEEE Trans. Circuits Syst. Video Technol., vol. 11, no. 11, pp. 1160–1170, Nov. 2001.
  91. E.G.T. Jaspers and P.H.N. de With, “Bandwidth reduction for video processing in consumer systems,” IEEE Trans. Consum. Electron., vol. 47, no. 4, pp. 885–894, Nov. 2001.
  92. C.-H. Li, W.-H. Peng, and T. Chiang, “Design of memory sub-system with constant-rate bumping process for H.264/AVC decoder,” IEEE Trans. Consum. Electron., vol. 53, no. 1, pp. 209–217, 2007.
  93. K.-B. Lee, T.-C. Lin, and C.-W. Jen, “An efficient quality-aware memory controller for multimedia platform SoC,” IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 5, pp. 620–633, May 2005.
  94. T.-M. Liu et al., “A 125 mW, fully scalable MPEG-2 and H.264/AVC video decoder for mobile applications,” IEEE J. Solid-State Circuits, vol. 42, no. 1, pp. 161–169, Jan. 2007.

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Kun-Bin Lee (1)

  1. MediaTek Inc., No. 1, Dusing Rd. 1, Science-based Industrial Park, Hsinchu, R.O.C.
