Journal of Real-Time Image Processing, Volume 16, Issue 2, pp 255–269

High speed on-chip multiple cosine transform generator

  • Yasser Ismail (corresponding author)
  • Ahmed Abdelgawad
  • Sherif El-etriby
Original Research Paper


Abstract

Speeding up the video encoding process is an important issue because real-time video applications demand ever faster video transmission. Many algorithms and hardware implementations have been proposed to meet this demand using pixel domain motion estimation. Substantial further savings in computation and hardware complexity can be achieved using frequency domain motion estimation (FDME). The key issue in the FDME encoder is generating the four different transforms that are used to decide the best-match motion vector in the motion estimation (ME) process. The related transformed-discrete cosine transform (RT-DCT) generator is responsible for generating these transforms in the FDME encoder. In this paper, a fast RT-DCT generator is proposed that shortens the FDME encoding time by reducing the generator's complexity. This makes the proposed RT-DCT generator appropriate for real-time video applications with limited resources, such as mobile internet devices, cellular phones, personal digital assistants (PDAs), and ultra-mobile personal computers. Moreover, the proposed architecture does not sacrifice resolution. With the proposed RT-DCT generator added to the complete FDME system, implementation and simulation results show that the system performs ME for SDTV video sequences at 371 MHz.


Keywords

  • Frequency domain motion estimation (FDME)
  • Unrolled parallel CORDIC architecture
  • Discrete cosine transform (DCT) coding
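
The keywords name a CORDIC architecture, but the text here does not describe its design. As a minimal sketch only, the following Python function shows the textbook rotation-mode CORDIC iteration, which approximates a cosine and sine using shift-and-add style updates. The function name `cordic_rotate`, the iteration count, and the floating-point arithmetic are all assumptions made for illustration; this is not the authors' hardware architecture.

```python
import math


def cordic_rotate(theta, iterations=24):
    """Approximate (cos(theta), sin(theta)) with rotation-mode CORDIC.

    Illustrative sketch only. Valid for |theta| up to roughly 1.74 rad,
    the convergence range of the basic algorithm.
    """
    # Elementary rotation angles atan(2^-i); in hardware these are constants.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Each micro-rotation stretches the vector by sqrt(1 + 2^-2i).
    # Dividing the start vector by the total gain compensates for this.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(iterations):
        # Rotate towards zero residual angle.
        d = 1.0 if z >= 0.0 else -1.0
        shift = 2.0 ** -i  # a right shift by i bits in fixed-point hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * angles[i]
    return x, y
```

In a software loop the iterations run one after another. An unrolled hardware design would typically give each iteration its own stage, which is one common reading of "unrolled" in this context.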



Acknowledgements

The authors thank the Institute of Scientific Research and Revival of Islamic Heritage, Umm Al-Qura University, KSA, for supporting this work under project number 43308016. The authors also acknowledge the support of the Deanship of Scientific Research, University of Bahrain, Kingdom of Bahrain. The authors thank the editor in chief, the associate editor, and the reviewers of the Journal of Real-Time Image Processing for their valuable and constructive comments, which helped to improve the quality of this paper.



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Yasser Ismail (1, 4), corresponding author
  • Ahmed Abdelgawad (2)
  • Sherif El-etriby (3, 5)

  1. University of Bahrain, Sakhair, Kingdom of Bahrain
  2. Central Michigan University, Mount Pleasant, USA
  3. Umm Al-Qura University, Mecca, Kingdom of Saudi Arabia
  4. Electronics and Communications Engineering Department, Faculty of Engineering, Mansoura University, Mansoura, Egypt
  5. Computer Science Department, Faculty of Computers and Information, Menoufia University, Shibin Al Kawm, Egypt
