Conclusion and Future Outlook

  • Muhammad Usman Karim Khan
  • Muhammad Shafique
  • Jörg Henkel


This book targets multimedia systems that operate under high-throughput, resource, and power constraints. It discusses efficient software- and application-level techniques together with hardware- and architecture-level designs for multimedia (specifically video) systems. The main aim of these techniques is to maximize the system's throughput per watt while accounting for modern design challenges and methodologies. The challenges addressed include parallelizing multimedia applications on possibly heterogeneous systems, balancing load across many-core and customized nodes, budgeting resources (number of cores and power), and efficiently designing the memory architecture of the multimedia system. In a broader perspective, these problems collectively represent the power wall, or dark silicon, challenge for next-generation video processing systems.



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Muhammad Usman Karim Khan: IBM Deutschland Research & Development GmbH, Böblingen, Germany
  • Muhammad Shafique: Institute of Computer Engineering, Vienna University of Technology, Vienna, Austria
  • Jörg Henkel: Department of Computer Science, Karlsruhe Institute of Technology, Karlsruhe, Germany
