Automatic Streamization of Image Processing Applications

Guillou, Pierre; Coelho, Fabien; Irigoin, François

doi:10.1007/978-3-319-17473-0_15

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8967)

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

Abstract

New many-core architectures such as the Kalray MPPA-256 provide energy efficiency and high performance for embedded systems. However, taking advantage of these opportunities requires careful manual optimization. We investigate the automatic streamization of image processing applications, implemented in C on top of a dedicated API, onto this target, which is accessed through the \(\Sigma\)C dataflow language. We discuss compiler and runtime design choices and their impact on performance. Our compilation techniques are implemented as source-to-source transformations in the PIPS open-source compilation framework. Experiments on a set of 8 test applications show that the Kalray MPPA target has the lowest energy consumption of the hardware targets compared.

Acknowledgements

Thanks to Danielle Bolan and Pierre Jouvelot for proofreading, to Antoniu Pop for his advice and bibliography pointers, to Kalray engineers Frédéric Blanc, Jérôme Bussery and Stéphane Gailhard for their support, and to the anonymous reviewers whose comments greatly helped to improve this paper.

Author information

Authors and Affiliations

MINES ParisTech, PSL Research University, Paris, France
Pierre Guillou, Fabien Coelho & François Irigoin

Corresponding author

Correspondence to Pierre Guillou.

Editor information

Editors and Affiliations

Intel Corporation, Santa Clara, California, USA
James Brodman
Intel Corporation, Santa Clara, California, USA
Peng Tu

About this paper

Cite this paper

Guillou, P., Coelho, F., Irigoin, F. (2015). Automatic Streamization of Image Processing Applications. In: Brodman, J., Tu, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2014. Lecture Notes in Computer Science, vol 8967. Springer, Cham. https://doi.org/10.1007/978-3-319-17473-0_15

DOI: https://doi.org/10.1007/978-3-319-17473-0_15
Published: 01 May 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17472-3
Online ISBN: 978-3-319-17473-0
eBook Packages: Computer Science, Computer Science (R0)
