Programmable and Scalable Architecture for Graphics Processing Units

de La Lama, Carlos S.; Jääskeläinen, Pekka; Takala, Jarmo

doi:10.1007/978-3-642-03138-0_2

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5657)

Included in the following conference series:

International Workshop on Embedded Computer Systems

Abstract

Graphics processing is an application area with a high level of parallelism at both the data level and the task level. Therefore, graphics processing units (GPUs) are often implemented as multiprocessing systems with high-performance floating-point processing and application-specific hardware stages for maximizing graphics throughput.

In this paper we evaluate the suitability of Transport Triggered Architectures (TTA) as a basis for implementing GPUs. TTA improves scalability over traditional VLIW-style architectures, making it interesting for computationally intensive applications. We show that TTA provides high floating-point processing performance while allowing more programming freedom than vector processors.

Finally, one of the main features of the presented TTA-based GPU design is its fully programmable architecture, which makes it a suitable target for the APIs for general-purpose computing on GPUs that have become popular in recent years.
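
As a rough illustration of the transport-triggered programming model named in the abstract, the following Python sketch treats a program as a list of data moves, where writing to a function unit's trigger port starts the operation as a side effect. It is not taken from the paper: the names ToyFMul and run, the port names, and the register names are all hypothetical, and the moves run one at a time, whereas a real TTA schedules several moves per instruction on parallel buses.

```python
# Toy model of transport triggering (illustrative assumption, not the
# paper's design): the program only moves data, and an operation fires
# when its trigger port is written.

class ToyFMul:
    """Hypothetical floating-point multiplier with operand, trigger and result ports."""

    def __init__(self):
        self.operand = 0.0
        self.result = 0.0

    def write(self, port, value):
        if port == "operand":
            self.operand = value                # plain latch, no operation starts
        elif port == "trigger":
            self.result = self.operand * value  # writing here starts the multiply
        else:
            raise KeyError(port)


def run(moves, regs, units):
    """Execute (source, destination) moves sequentially."""
    for src, dst in moves:
        # Source is a literal, a register, or a unit's result port.
        if isinstance(src, float):
            value = src
        elif src in regs:
            value = regs[src]
        else:
            unit, _ = src.split(".")
            value = units[unit].result
        # Destination is a register or a unit's input port.
        if dst in regs:
            regs[dst] = value
        else:
            unit, port = dst.split(".")
            units[unit].write(port, value)
    return regs


regs = {"r0": 3.0, "r1": 4.0, "r2": 0.0}
units = {"fmul": ToyFMul()}
program = [
    ("r0", "fmul.operand"),   # set the first operand
    ("r1", "fmul.trigger"),   # second operand; this move triggers the multiply
    ("fmul.result", "r2"),    # read the result back into a register
]
run(program, regs, units)
print(regs["r2"])  # 12.0
```

Note that the program contains no multiply instruction at all: the operation is implied by which port the second move targets, which is the property that lets the compiler schedule individual transports.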

Author information

Authors and Affiliations

Department of Computer Architecture, Computer Science and Artificial Intelligence, Universidad Rey Juan Carlos, C/ Tulipán s/n, 28933 Móstoles, Madrid, Spain
Carlos S. de La Lama
Department of Computer Systems, Tampere University of Technology, Korkeakoulunkatu 10, 33720, Tampere, Finland
Pekka Jääskeläinen & Jarmo Takala

Editor information

Editors and Affiliations

Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands
Koen Bertels & Stephan Wong
Department of Electrical and Computer Engineering, University of Victoria, P.O. Box 3055, V8W 3P6, Victoria, BC, Canada
Nikitas Dimopoulos
Dipartimento di Elettronica e Informazione, Politecnico di Milano, P.za Leonardo Da Vinci 32, 20133, Milan, Italy
Cristina Silvano

About this paper

Cite this paper

de La Lama, C.S., Jääskeläinen, P., Takala, J. (2009). Programmable and Scalable Architecture for Graphics Processing Units. In: Bertels, K., Dimopoulos, N., Silvano, C., Wong, S. (eds) Embedded Computer Systems: Architectures, Modeling, and Simulation. SAMOS 2009. Lecture Notes in Computer Science, vol 5657. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03138-0_2

DOI: https://doi.org/10.1007/978-3-642-03138-0_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03137-3
Online ISBN: 978-3-642-03138-0
eBook Packages: Computer Science, Computer Science (R0)