System-on-Chip Architectures for Data Analytics

  • Gwo Giun (Chris) Lee
  • Chun-Fu Chen
  • Tai-Ping Wang


Artificial Intelligence (AI) plays an important role in Industry 4.0, intelligent transportation systems, intelligent biomedical systems, healthcare, and similar domains, all of which require complex algorithms. Deep learning, for example, is a popular class of machine learning algorithms that places high computational demands on edge platforms in Internet-of-Things applications. This chapter introduces Algorithm/Architecture Co-Design, a system design methodology in which an algorithm is designed concurrently with a highly efficient, flexible, and low-power architecture, together constituting a smart System-on-Chip design.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2019

Authors and Affiliations

  • Gwo Giun (Chris) Lee (1, corresponding author)
  • Chun-Fu Chen (1)
  • Tai-Ping Wang (2)

  1. Department of Electrical Engineering, National Cheng Kung University, Tainan City, Taiwan
  2. IBM T.J. Watson Research Center, Yorktown Heights, USA
