Weld: A Multithreading Technique Towards Latency-tolerant VLIW Processors

Özer, Emre; Conte, Thomas M.; Sharma, Saurabh

doi:10.1007/3-540-45307-5_17

Conference paper
First Online: 01 January 2001

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2228)

Abstract

This paper presents a new architecture model, named Weld, for VLIW processors. Weld integrates multithreading support into a VLIW processor to hide run-time latency effects that the compiler cannot determine. It does so through a novel hardware technique called operation welding, which merges operations from different threads to use the hardware resources more efficiently. Hardware contexts such as program counters and fetch units are duplicated to support multithreading. Experimental results show that the Weld architecture attains a speedup of up to 27% over a single-threaded VLIW architecture.

Note: A MultiOp is a group of instructions that can potentially be executed in parallel.
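
As a rough illustration of the idea of merging operations from different threads, the following toy Python sketch fills the empty issue slots of one thread's MultiOp with operations from another thread's MultiOp. It is not the hardware mechanism from the paper: the `weld` function, the slot-list representation, and the assumption that any slot can execute any operation are all hypothetical simplifications chosen for this example.

```python
from typing import List, Optional, Tuple

# One entry per issue slot; None marks an empty (NOP) slot.
# This representation is an assumption made for illustration only.
MultiOp = List[Optional[str]]


def weld(host: MultiOp, donor: MultiOp) -> Tuple[MultiOp, MultiOp]:
    """Fill empty slots of `host` with operations taken from `donor`.

    Returns the merged MultiOp and whatever is left of the donor.
    Simplifying assumption: every slot can execute every operation.
    """
    if len(host) != len(donor):
        raise ValueError("both MultiOps must have the same issue width")
    merged = list(host)
    pending = [op for op in donor if op is not None]
    for slot, op in enumerate(merged):
        if op is None and pending:
            merged[slot] = pending.pop(0)
    # Operations that did not fit stay with the donor thread.
    leftover = pending + [None] * (len(donor) - len(pending))
    return merged, leftover


thread_a = ["add", None, "load", None]    # two free slots
thread_b = ["mul", "sub", None, "store"]  # three operations to place
merged, leftover = weld(thread_a, thread_b)
print(merged)    # ['add', 'mul', 'load', 'sub']
print(leftover)  # ['store', None, None, None]
```

In this toy model the merged MultiOp issues four operations in one cycle instead of two, which is the kind of improved resource utilization the abstract describes. A real design would also have to respect functional-unit types and per-thread register state, which the sketch ignores.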

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, North Carolina State University, 27695, Raleigh, NC
Emre Özer, Thomas M. Conte & Saurabh Sharma

Editor information

Editors and Affiliations

Department of Mathematics and Computer Science, University of Paderborn, Fürstenallee 11, 33102, Paderborn, Germany
Burkhard Monien
Department of EE-Systems, Computer Engineering Division, University of Southern California, 3740 McClintock Avenue, EEB 200C, 90089-2562, Los Angeles, CA, USA
Viktor K. Prasanna
Independent Consultant, c/o Infosys Ltd., “Mangala”, Kuloor Ferry Road, Kottara, 575006, Mangalore, India
Sriram Vajapeyam

About this paper

Cite this paper

Özer, E., Conte, T.M., Sharma, S. (2001). Weld: A Multithreading Technique Towards Latency-tolerant VLIW Processors. In: Monien, B., Prasanna, V.K., Vajapeyam, S. (eds) High Performance Computing — HiPC 2001. HiPC 2001. Lecture Notes in Computer Science, vol 2228. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45307-5_17

DOI: https://doi.org/10.1007/3-540-45307-5_17
Published: 04 December 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43009-4
Online ISBN: 978-3-540-45307-9
eBook Packages: Springer Book Archive
