Execution Efficiency of the Microthreaded Pipeline

Daněk, Martin; Kafka, Leoš; Kohout, Lukáš; Sýkora, Jaroslav; Bartosiński, Roman

doi:10.1007/978-1-4614-2410-9_7

Abstract

When analyzing the execution efficiency of the microthreaded pipeline, we are interested in two key quantities:

1. The number of stall clock cycles in the processing pipeline.
2. The latency tolerance for blocking and non-blocking long-latency operations.
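
As a rough illustration of how these two quantities might be expressed numerically, the sketch below uses an idealized model that is not taken from the chapter: efficiency is the fraction of cycles that are not stalls, and latency is considered hidden when enough other ready threads exist to fill the wait, with zero thread-switch overhead. The function names and parameters are hypothetical.

```python
def pipeline_efficiency(useful_cycles: int, stall_cycles: int) -> float:
    """Fraction of clock cycles in which the pipeline does useful work.

    Assumption: every cycle is either useful or a stall.
    """
    total = useful_cycles + stall_cycles
    if total <= 0:
        raise ValueError("at least one cycle must be recorded")
    return useful_cycles / total


def threads_to_hide_latency(latency_cycles: int, run_cycles: int) -> int:
    """Smallest number of threads that keeps the pipeline busy.

    Assumptions (idealized, not from the chapter): each thread executes
    run_cycles cycles between long-latency operations, each such operation
    takes latency_cycles cycles, and switching threads costs nothing.
    While one thread waits, the remaining n - 1 threads must supply at
    least latency_cycles cycles of work: (n - 1) * run_cycles >= latency_cycles.
    """
    if run_cycles <= 0 or latency_cycles < 0:
        raise ValueError("run_cycles must be positive, latency_cycles non-negative")
    # Ceiling division without floating point.
    return 1 + -(-latency_cycles // run_cycles)


if __name__ == "__main__":
    print(pipeline_efficiency(900, 100))
    print(threads_to_hide_latency(40, 10))
```

Under these assumptions, a pipeline with fewer ready threads than `threads_to_hide_latency` returns would show the uncovered latency as stall cycles, which lowers the value reported by `pipeline_efficiency`.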

Author information

Authors and Affiliations

Signal Processing, ÚTIA AV ČR, v.v.i., Pod Vodárenskou věží 1143/4, Praha 8, Czech Republic
Martin Daněk, Leoš Kafka, Lukáš Kohout, Jaroslav Sýkora & Roman Bartosiński

About this chapter

Cite this chapter

Daněk, M., Kafka, L., Kohout, L., Sýkora, J., Bartosiński, R. (2013). Execution Efficiency of the Microthreaded Pipeline. In: UTLEON3: Exploring Fine-Grain Multi-Threading in FPGAs. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-2410-9_7

DOI: https://doi.org/10.1007/978-1-4614-2410-9_7
Published: 21 August 2012
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-2409-3
Online ISBN: 978-1-4614-2410-9
eBook Packages: Engineering, Engineering (R0)
