Automatic Calibration of Performance Models on Heterogeneous Multicore Architectures

Augonnet, Cédric; Thibault, Samuel; Namyst, Raymond

doi:10.1007/978-3-642-14122-5_9


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6043)

Included in the following conference series:

European Conference on Parallel Processing


Abstract

Multicore architectures featuring specialized accelerators are attracting increasing attention, and this success will probably influence the design of future High Performance Computing hardware. Unfortunately, programmers still struggle to exploit all of these heterogeneous computing units efficiently, and most existing efforts simply focus on providing tools to offload some computations onto the available accelerators. Recently, some runtime systems have been designed that exploit the idea of scheduling, as opposed to offloading, parallel tasks over the whole set of heterogeneous computing units. Scheduling tasks over heterogeneous platforms requires accurate prediction models in order to assign each task to its most adequate computing unit [2]. Building per-task performance models usually requires a deep knowledge of the application, since such models are based on the algorithmic complexity of the underlying numeric kernel.
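As an illustration of why prediction accuracy matters for scheduling, the following minimal sketch assigns a task to the unit with the earliest predicted finish time. It is not the authors' scheduler: the function `pick_unit`, the unit names, and the timing values are hypothetical and chosen only for the example.

```python
def pick_unit(ready_at, predicted_time):
    """Return the unit with the earliest predicted finish time.

    ready_at:       dict unit -> time at which the unit becomes free
    predicted_time: dict unit -> predicted duration of the task on that unit
    Both dictionaries are hypothetical inputs for this sketch.
    """
    return min(
        predicted_time,
        key=lambda unit: ready_at.get(unit, 0.0) + predicted_time[unit],
    )


# Illustrative values only: the GPU is busy until t=4 but runs the task faster.
ready_at = {"cpu0": 0.0, "gpu0": 4.0}
predicted = {"cpu0": 10.0, "gpu0": 2.0}
chosen = pick_unit(ready_at, predicted)
print(chosen)
```

With these values the GPU is chosen despite being busy, because its predicted finish time (4 + 2) is earlier than the CPU's (0 + 10). A poor prediction for either unit could reverse that decision.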

We present an alternative, auto-tuning performance prediction approach based on performance history tables that are built dynamically during the application run. This approach does not require the programmer to provide any specific information. We show that, thanks to a carefully chosen hash function, our approach quickly and automatically achieves accurate performance estimations. It even outperforms regular algorithmic performance models on several linear algebra numerical kernels.
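The idea of a history table indexed by a hash can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the chapter: the class `HistoryModel`, its methods, the use of CRC32 over the data dimensions as the hash, and the running mean as the estimator are all choices made for this example.

```python
import zlib
from collections import defaultdict


class HistoryModel:
    """History-based performance model (hypothetical sketch)."""

    def __init__(self):
        # (kernel, architecture, footprint) -> [sample count, mean duration]
        self._table = defaultdict(lambda: [0, 0.0])

    @staticmethod
    def footprint(shapes):
        # Assumption: the hash covers only the dimensions of the task's data,
        # so tasks with the same data layout share one table entry.
        text = ";".join("x".join(str(dim) for dim in shape) for shape in shapes)
        return zlib.crc32(text.encode("ascii"))

    def record(self, kernel, arch, shapes, duration):
        """Fold one measured execution time into the running mean."""
        entry = self._table[(kernel, arch, self.footprint(shapes))]
        entry[0] += 1
        entry[1] += (duration - entry[1]) / entry[0]

    def predict(self, kernel, arch, shapes):
        """Return the mean measured time, or None if this case is unseen."""
        entry = self._table.get((kernel, arch, self.footprint(shapes)))
        return entry[1] if entry else None


model = HistoryModel()
blocks = [(512, 512)] * 3           # illustrative: three 512x512 matrix blocks
model.record("gemm", "gpu", blocks, 2.0)
model.record("gemm", "gpu", blocks, 4.0)
print(model.predict("gemm", "gpu", blocks))             # mean of the two samples
print(model.predict("gemm", "gpu", [(1024, 1024)] * 3))  # unseen layout
```

In this sketch the model needs no input from the programmer: the table fills in as tasks run, and a task with an unseen layout simply has no prediction until it has been measured once.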


References

1. Augonnet, C., Namyst, R.: A unified runtime system for heterogeneous multicore architectures. In: Euro-Par 2008 Workshops - HPPC’08, Las Palmas de Gran Canaria, Spain (August 2008)
2. Augonnet, C., Thibault, S., Namyst, R., Wacrenier, P.A.: StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures. In: Sips, H., Epema, D., Lin, H.-X. (eds.) Euro-Par 2009 Parallel Processing. LNCS, vol. 5704, pp. 863–874. Springer, Heidelberg (2009)
3. Bellens, P., Perez, J.M., Badia, R.M., Labarta, J.: CellSs: a programming model for the Cell BE architecture. In: Proceedings of SC’06, Tampa, Florida (2006)
4. Clint Whaley, R., Dongarra, J.: Automatically Tuned Linear Algebra Software. In: Proceedings of SIAM PP’99, San Antonio, Texas (March 1999)
5. Diamos, G., Yalamanchili, S.: Harmony: Runtime Techniques for Dynamic Concurrency Inference, Resource Constrained Hierarchical Scheduling, and Online Optimization in Heterogeneous Multiprocessor Systems. Technical report, Georgia Institute of Technology, Computer Architecture and Systems Lab (2008)
6. Duran, A., Perez, J.M., Ayguade, E., Badia, R., Labarta, J.: Extending the OpenMP tasking model to allow dependent tasks. In: Eigenmann, R., de Supinski, B.R. (eds.) IWOMP 2008. LNCS, vol. 5004, pp. 111–122. Springer, Heidelberg (2008)
7. Fatahalian, K., Knight, T.J., Houston, M., Erez, M., Reiter Horn, D., Leem, L., Young Park, J., Ren, M., Aiken, A., Dally, W.J., Hanrahan, P.: Sequoia: Programming the memory hierarchy. In: Proceedings of SC’06, Tampa, Florida (2006)
8. Jiménez, V.J., Vilanova, L., Gelado, I., Gil, M., Fursin, G., Navarro, N.: Predictive runtime code scheduling for heterogeneous architectures. In: Seznec, A., Emer, J., O’Boyle, M., Martonosi, M., Ungerer, T. (eds.) HiPEAC 2009. LNCS, vol. 5409, pp. 19–33. Springer, Heidelberg (2009)
9. Li, Y., Dongarra, J., Tomov, S.: A Note on Auto-tuning GEMM for GPUs. In: Proceedings of ICCS’09, Baton Rouge, Louisiana, U.S.A. (2009)
10. McCool, M.D.: Data-Parallel Programming on the Cell BE and the GPU using the RapidMind Development Platform. In: GSPx’06 Multicore Applications Conference (2006)
11. Tomov, S., Dongarra, J., Baboulin, M.: Towards Dense Linear Algebra for Hybrid GPU Accelerated Manycore Systems. Technical report (January 2009)
12. Topcuoglu, H., Hariri, S., Wu, M.-Y.: Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Transactions on Parallel and Distributed Systems 13(3), 260–274 (2002)


Author information

Authors and Affiliations

INRIA Bordeaux, LaBRI, University of Bordeaux, France
Cédric Augonnet, Samuel Thibault & Raymond Namyst


Editor information

Editors and Affiliations

Institute for Applied Mathematics, Delft University of Technology, 2628, Delft, The Netherlands
Hai-Xiang Lin
Scaledinfra technologies GmbH, Köllnerhofgasse 3/15A, 1010, Vienna, Austria
Michael Alexander
VTT, Kaitovayla 1, 90570, Oulu, Finland
Martti Forsell
Technische Universität Dresden, 01069, Dresden, Germany
Andreas Knüpfer
Institute for Computer Science, Technical University of Innsbruck, 6020, Innsbruck, Austria
Radu Prodan
Instituto Superior Técnico/INESC-ID., Rua Alves Redol 9, 1000-029, Lisbon, Portugal
Leonel Sousa
Jülich Supercomputing Centre, 52425, Jülich, Germany
Achim Streit


About this paper

Cite this paper

Augonnet, C., Thibault, S., Namyst, R. (2010). Automatic Calibration of Performance Models on Heterogeneous Multicore Architectures. In: Lin, HX., et al. Euro-Par 2009 – Parallel Processing Workshops. Euro-Par 2009. Lecture Notes in Computer Science, vol 6043. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14122-5_9


DOI: https://doi.org/10.1007/978-3-642-14122-5_9
Published: 17 June 2010
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14121-8
Online ISBN: 978-3-642-14122-5
eBook Packages: Computer Science, Computer Science (R0)
