Abstract
This work presents a configurable and scalable vector coprocessor for real-time processing of MPEG4 motion estimation (ME) and the two-dimensional discrete cosine transform (2D DCT). A sequential DSP processor based on a reduced instruction set computer (RISC) architecture would need a clock frequency of 15 GHz to perform these two tasks in real time for a common intermediate format (CIF) sequence at 25 frames per second (fps), and this requirement grows with the image dimensions. In contrast, our FPGA-based architecture achieves real-time combined ME and 2D DCT at a low clock frequency for CIF sequences and at a higher frequency for full high definition (FHD) sequences. Because both the architecture and the FPGA are configurable, the design can be extended to sequences with larger image dimensions. An important aspect of the architecture is that the datapath used for ME is reused, with minor modification, for the 2D DCT, which saves area and processing time. In addition, the processor–coprocessor architecture has lower energy consumption and cost than the sequential processor.
Appendices
Appendix A
Pseudocode for Vectorized 2D DCT
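As a rough illustration only, and not the authors' pseudocode, the sketch below shows one common way to express a 2D DCT as a sequence of vector dot products: the row-column decomposition, in which a 1D DCT is applied to every row, the result is transposed, and the 1D DCT is applied again. The 8x8 block size, the orthonormal DCT-II scaling, and all function names (`dct_matrix`, `dot`, `dct_rows`, `dct_2d`) are assumptions made for this example.

```python
import math

N = 8  # assumed block size for this sketch


def dct_matrix(n=N):
    """Orthonormal DCT-II basis: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def dot(u, v):
    """One vector multiply-accumulate over two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct_rows(block, basis):
    """1D DCT of every row: out[r][k] is the dot product of row r with basis k."""
    return [[dot(row, b) for b in basis] for row in block]


def dct_2d(block):
    """Row-column 2D DCT: rows, transpose, rows again, transpose back."""
    basis = dct_matrix(len(block))
    row_pass = dct_rows(block, basis)
    return transpose(dct_rows(transpose(row_pass), basis))
```

Each output coefficient here is a dot product of two length-8 vectors, which is the kind of operation a vector unit can evaluate across its lanes; how the coprocessor actually schedules these operations is not specified in this sketch.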
Appendix B
Pseudocode for Vectorized ME
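As a rough illustration only, and not the authors' pseudocode, the sketch below shows block-matching motion estimation with a sum of absolute differences (SAD) cost, written so that each block row is processed as one vector operation. The exhaustive (full) search strategy, the default block size of 16, the search range of 7, and the names `sad` and `full_search` are assumptions made for this example; the paper's actual search algorithm and parameters may differ.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """SAD between the n x n block of cur at (cx, cy) and of ref at (rx, ry)."""
    total = 0
    for j in range(n):
        cur_row = cur[cy + j][cx:cx + n]
        ref_row = ref[ry + j][rx:rx + n]
        # One row of absolute differences, reduced to a scalar.
        total += sum(abs(a - b) for a, b in zip(cur_row, ref_row))
    return total


def full_search(cur, ref, cx, cy, n=16, search=7):
    """Return ((dx, dy), cost) minimizing SAD within +/- search pixels."""
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_sad = sad(cur, ref, cx, cy, cx, cy, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if cost < best_sad:
                best_sad, best = cost, (dx, dy)
    return best, best_sad
```

For example, if the current frame is the reference frame shifted by two pixels horizontally and one pixel vertically, `full_search` returns the motion vector `(2, 1)` with a SAD of zero, provided the shift lies within the search range.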
Cite this article
Agha, S., Gulzari, U.A., Shaheen, F. et al. A high throughput two-dimensional discrete cosine transform and MPEG4 motion estimation using vector coprocessor. J Real-Time Image Proc 17, 1319–1330 (2020). https://doi.org/10.1007/s11554-019-00892-9