Weld: A Multithreading Technique Towards Latency-tolerant VLIW Processors

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2228)

Abstract

This paper presents a new architecture model, named Weld, for VLIW processors. Weld integrates multithreading support into a VLIW processor to hide run-time latency effects that cannot be determined by the compiler. It does this through a novel hardware technique called operation welding, which merges operations from different threads to use the hardware resources more efficiently. Hardware contexts such as program counters and fetch units are duplicated to support multithreading. The experimental results show that the Weld architecture attains a speedup of up to 27% over a single-threaded VLIW architecture. Note: a MultiOp is a group of instructions that can potentially be executed in parallel.
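
As a rough illustration of the idea of merging operations from different threads, the following minimal Python sketch models a MultiOp as a fixed-width list of issue slots and merges two of them when they do not compete for the same slot. The slot model, the function name `weld`, the operation names, and the all-or-nothing merge rule are assumptions made for illustration only; the abstract does not describe the actual welding hardware at this level of detail.

```python
from typing import List, Optional

# A MultiOp is modelled as a fixed-width list of issue slots.
# Each slot holds an operation name, or None for an empty slot (NOP).
MultiOp = List[Optional[str]]


def weld(a: MultiOp, b: MultiOp) -> Optional[MultiOp]:
    """Merge two MultiOps from different threads into one issue word.

    Assumption (not from the paper): welding succeeds only when the two
    MultiOps never occupy the same slot; otherwise None is returned and
    the caller would issue them in separate cycles.
    """
    if len(a) != len(b):
        raise ValueError("MultiOps must have the same issue width")
    merged: MultiOp = []
    for op_a, op_b in zip(a, b):
        if op_a is not None and op_b is not None:
            return None  # slot conflict: both threads need this slot
        merged.append(op_a if op_a is not None else op_b)
    return merged


# Hypothetical 4-wide MultiOps from three threads.
thread0 = ["add", None, "load", None]
thread1 = [None, "mul", None, None]
thread2 = ["sub", None, None, None]

print(weld(thread0, thread1))  # ['add', 'mul', 'load', None]
print(weld(thread0, thread2))  # None
```

In this toy model the first pair fills three of four slots in a single cycle instead of spreading over two, which is the kind of resource utilization gain the abstract alludes to; a real design would also have to handle per-thread register contexts and latency events, which the sketch ignores.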

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Özer, E., Conte, T.M., Sharma, S. (2001). Weld: A Multithreading Technique Towards Latency-tolerant VLIW Processors. In: Monien, B., Prasanna, V.K., Vajapeyam, S. (eds) High Performance Computing — HiPC 2001. HiPC 2001. Lecture Notes in Computer Science, vol 2228. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45307-5_17

  • DOI: https://doi.org/10.1007/3-540-45307-5_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43009-4

  • Online ISBN: 978-3-540-45307-9

  • eBook Packages: Springer Book Archive
