Abstract
In this chapter we trace the history of computer architecture, focusing on the evolution of techniques for instruction-level parallelism. After briefly summarizing the early years of machine design, we turn to the development of out-of-order, pipelined, and multiple-issue processors. These are further divided into processors that perform instruction scheduling entirely in hardware (e.g., superscalar machines) and those that expose instruction scheduling to the compiler, particularly VLIW machines such as the Multiflow Trace, Cydra 5, and Itanium.
Copyright information
© 2016 Springer-Verlag US
About this chapter
Cite this chapter
Aiken, A., Banerjee, U., Kejariwal, A., Nicolau, A. (2016). Overview of ILP Architectures. In: Instruction Level Parallelism. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7797-7_2
DOI: https://doi.org/10.1007/978-1-4899-7797-7_2
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4899-7795-3
Online ISBN: 978-1-4899-7797-7
eBook Packages: Computer Science (R0)