High Performance General-Purpose Microprocessors: Past and Future

  • Architecture
Journal of Computer Science and Technology

Abstract

Looking back, processor architecture has evolved in a spiral, shifting from simple to complex and from complex back to simple. We now face another shift from complex to simple, and innovative architectures will emerge to exploit continuously increasing transistor budgets. The growing importance of wire delays, changing workloads, power consumption, and design and verification complexity will drive the forthcoming era of chip multiprocessors (CMPs). Representative CMP projects from both industry and academia are surveyed. By examining some of the primary theoretical and implementation problems of CMPs, the paper presents and discusses the major challenges and opportunities facing future CMPs. Finally, the Godson series of microprocessors designed in China is introduced.

Author information

Corresponding author

Correspondence to Wei-Wu Hu.

Additional information

Survey: Supported by the National Natural Science Foundation of China for Distinguished Young Scholar under Grant No. 60325205, the National High Technology Development 863 Program of China under Grants No. 2002AA110010, No. 2005AA110010, No. 2005AA119020, and the National Grand Fundamental Research 973 Program of China under Grant No. 2005CB321600.

Wei-Wu Hu received his B.S. degree from the University of Science and Technology of China in 1991 and his Ph.D. degree from the Institute of Computing Technology, Chinese Academy of Sciences, in 1996, both in computer science. He is currently a professor at the Institute of Computing Technology. His research interests include high performance computer architecture, parallel processing, and VLSI design.

Rui Hou is currently a Ph.D. candidate at the Institute of Computing Technology, Chinese Academy of Sciences. His research interests include high performance computer architecture.

Jun-Hua Xiao is currently a Ph.D. candidate at the Institute of Computing Technology, Chinese Academy of Sciences. His research interests include high performance computer architecture.

Long-Bing Zhang received his Ph.D. degree from the University of Science and Technology of China in 2002 and completed his postdoctoral work at the Institute of Computing Technology, Chinese Academy of Sciences, in 2004. He is now an associate researcher at the Institute of Computing Technology, Chinese Academy of Sciences. His research interests include microprocessor design, system architecture, and cluster computing.

About this article

Cite this article

Hu, WW., Hou, R., Xiao, JH. et al. High Performance General-Purpose Microprocessors: Past and Future. J Comput Sci Technol 21, 631–640 (2006). https://doi.org/10.1007/s11390-006-0631-6

  • DOI: https://doi.org/10.1007/s11390-006-0631-6
