Abstract
Traditional uni-core processors face tremendous challenges in improving their performance and energy efficiency and in adapting to deep-submicron fabrication technology. Meanwhile, traditional ASIC implementations are increasingly prohibitive because of their inherent inflexibility and high design cost. At the same time, rapidly advancing fabrication technologies have enabled the integration of many processors on a single chip, known as a multi-core processor, which promises a platform with high performance, high energy efficiency, and high flexibility.
This chapter discusses the motivations for shifting from traditional IC systems (uni-core processors and ASIC implementations) to multi-core processors, examines design cases of multi-core processors and their key features, and looks ahead to future work.
Copyright information
© 2010 Springer Science+Business Media B.V.
About this chapter
Cite this chapter
Yu, Z. (2010). Towards High-Performance and Energy-Efficient Multi-core Processors. In: Iniewski, K. (eds) CMOS Processors and Memories. Analog Circuits and Signal Processing. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9216-8_2
DOI: https://doi.org/10.1007/978-90-481-9216-8_2
Published:
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-9215-1
Online ISBN: 978-90-481-9216-8
eBook Packages: Engineering, Engineering (R0)