Abstract
Limits in applications and hardware technology put a stop to the frequency race during the 2000s. Designs can now be divided into homogeneous and heterogeneous ones. Homogeneous designs are the easiest to use, since most toolchains and system software need little rewriting. At the other end of the spectrum are the type two heterogeneous designs. These offer tremendous raw computational power, but at the cost of omitting hardware features that may be necessary or even essential for certain types of system software and programming languages. An example of this design is the Cell processor, which combines a heavy core with a group of simple cores designed as a computational engine. Although the Cell processor is well known for its accomplishments, it is equally well known for its low programmability. Among the many efforts to increase its programmability is the Open OPELL project, a framework that tries to port the OpenMP programming model to the Cell architecture. The OPELL framework is composed of four components: a single source toolchain, a very light SPU kernel, a software cache, and a partition / code overlay manager. To reduce overhead, each of these components can be further optimized. This paper concentrates on optimizing the partition manager by reducing the number of long latency transactions. The contributions of this work are as follows.
1. The development of a dynamic framework that loads and manages partitions across function calls to bypass the problem with restrictive memory spaces.
2. The implementation of replacement policies that are useful to reduce the number of DMA calls across partitions.
3. A quantification of such replacement policies given a selected set of applications.
4. An API which can be easily ported and extended to several types of architectures.
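To illustrate why a replacement policy can reduce the number of DMA calls, the following is a minimal sketch, not the OPELL implementation. It assumes a local store that holds a fixed number of code partitions, a least-recently-used (LRU) eviction rule chosen here only as an example policy, and a call trace in which returning to the caller is listed as another entry. All names (`PartitionManager`, `count_loads`, `PARTITION_OF`, `TRACE`) are hypothetical.

```python
from collections import OrderedDict


class PartitionManager:
    """Toy model of a code-partition manager for a small local store.

    Hypothetical names throughout; this is not the OPELL API.
    """

    def __init__(self, num_slots, partition_of):
        self.num_slots = num_slots        # partitions that fit in local memory at once
        self.partition_of = partition_of  # function name -> partition id
        self.resident = OrderedDict()     # resident partition ids, least recently used first
        self.dma_loads = 0                # simulated long-latency transfers

    def call(self, function):
        """Make the function's partition resident; return True if a load was needed."""
        pid = self.partition_of[function]
        if pid in self.resident:
            self.resident.move_to_end(pid)      # hit: refresh recency, no transfer
            return False
        if len(self.resident) >= self.num_slots:
            self.resident.popitem(last=False)   # evict the least recently used partition
        self.resident[pid] = None
        self.dma_loads += 1                     # miss: one simulated DMA transfer
        return True


def count_loads(num_slots, partition_of, trace):
    """Replay a call trace and return the number of simulated DMA loads."""
    manager = PartitionManager(num_slots, partition_of)
    for function in trace:
        manager.call(function)
    return manager.dma_loads


# Assumed example: three functions, each placed in its own partition.
PARTITION_OF = {"main_loop": 0, "fft": 1, "filter": 2}
# Calls and returns flattened into one trace (a return re-enters the caller).
TRACE = ["main_loop", "fft", "main_loop", "filter", "main_loop", "fft"]

if __name__ == "__main__":
    for slots in (1, 2, 3):
        print(slots, "slot(s):", count_loads(slots, PARTITION_OF, TRACE), "loads")
```

With a single slot every entry in this trace forces a transfer, while keeping the caller's partition resident in a second slot removes the reloads on return. The numbers come from this toy trace only and say nothing about the measurements reported in the paper.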
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Manzano, J.B., Gan, G., Ributzka, J., Shrestha, S., Gao, G.R. (2013). OPELL and PM: A Case Study on Porting Shared Memory Programming Models to Accelerators Architectures. In: Rajopadhye, S., Mills Strout, M. (eds) Languages and Compilers for Parallel Computing. LCPC 2011. Lecture Notes in Computer Science, vol 7146. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36036-7_8
DOI: https://doi.org/10.1007/978-3-642-36036-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36035-0
Online ISBN: 978-3-642-36036-7
eBook Packages: Computer Science (R0)