Synchroscalar: Initial Lessons in Power-Aware Design of a Tile-Based Embedded Architecture

  • Conference paper
Power-Aware Computer Systems (PACS 2003)

Abstract

Embedded devices have hard performance targets and severe power and area constraints that depart significantly from the design intuitions derived from general-purpose microprocessor design. This paper describes our initial experiences in designing Synchroscalar, a tile-based embedded architecture targeted at multi-rate signal processing applications.

We present a preliminary design of the Synchroscalar architecture and some design space exploration in the context of important signal processing kernels. In particular, we find that synchronous design and substantial global interconnect are desirable in the low-frequency, low-power domain. This global interconnect enables parallelization and reduces processor idle time, both of which are critical to energy-efficient implementations of high-bandwidth signal processing. Furthermore, statically scheduled communication and SIMD computation keep control overheads low and energy efficiency high.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Oliver, J. et al. (2005). Synchroscalar: Initial Lessons in Power-Aware Design of a Tile-Based Embedded Architecture. In: Falsafi, B., VijayKumar, T.N. (eds) Power-Aware Computer Systems. PACS 2003. Lecture Notes in Computer Science, vol 3164. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28641-7_6

  • DOI: https://doi.org/10.1007/978-3-540-28641-7_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24031-0

  • Online ISBN: 978-3-540-28641-7

  • eBook Packages: Computer Science, Computer Science (R0)
