Abstract
New many-core architectures such as the Kalray MPPA-256 offer both energy efficiency and high performance for embedded systems. However, exploiting these capabilities requires careful manual optimization. We investigate the automatic streamization of image processing applications, implemented in C on top of a dedicated API, onto this target, which is accessed through the \(\varSigma \mathrm{C}\) dataflow language. We discuss compiler and runtime design choices and their impact on performance. Our compilation techniques are implemented as source-to-source transformations in the PIPS open-source compilation framework. Experiments on a set of 8 test applications show that the Kalray MPPA target has the lowest energy consumption of the hardware targets compared.
Acknowledgements
Thanks to Danielle Bolan and Pierre Jouvelot for proofreading, to Antoniu Pop for his advice and bibliography pointers, to Kalray engineers Frédéric Blanc, Jérôme Bussery and Stéphane Gailhard for their support, and to the anonymous reviewers, whose comments greatly helped to improve this paper.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Guillou, P., Coelho, F., Irigoin, F. (2015). Automatic Streamization of Image Processing Applications. In: Brodman, J., Tu, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2014. Lecture Notes in Computer Science, vol 8967. Springer, Cham. https://doi.org/10.1007/978-3-319-17473-0_15
DOI: https://doi.org/10.1007/978-3-319-17473-0_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17472-3
Online ISBN: 978-3-319-17473-0
eBook Packages: Computer Science, Computer Science (R0)