Skip to main content

Automatic Streamization of Image Processing Applications

  • Conference paper
  • First Online:
Languages and Compilers for Parallel Computing (LCPC 2014)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 8967))

  • 883 Accesses

Abstract

New many-core architectures such as the Kalray MPPA-256 provide energy-efficiency and high performance for embedded systems. However, to take advantage of these opportunities, careful manual optimizations are required. We investigate the automatic streamization of image processing applications, implemented in C on top of a dedicated API, onto this target accessed through the \(\varSigma \mathrm{C}\) dataflow language. We discuss compiler and runtime design choices and their impact on performance. Our compilation techniques are implemented as source-to-source transformations in the PIPS open-source compilation framework. Experiments show lowest energy consumption on the Kalray MPPA target compared to other hardware targets for a range of 8 test applications.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. The streamit language (2002). http://www.cag.lcs.mit.edu/streamit/

  2. Tera-scale architecture (2008). https://www-asim.lip6.fr/trac/tsar/wiki

  3. The TilePro64 many-core architecture (2008). http://www.tilera.com/

  4. The Epiphany many-core architecture (2012). http://www.adapteva.com/

  5. Aubry, P., Beaucamps, P.E., Blanc, F., Bodin, B., Carpov, S., Cudennec, L., David, V., Dore, P., Dubrulle, P., Dupont de Dinechin, B., Galea, F., Goubier, T., Harrand, M., Jones, S., Lesage, J.D., Louise, S., Chaisemartin, N.M., Nguyen, T.H., Raynaud, X., Sirdey, R.: Extended cyclostatic dataflow program compilation and execution for an integrated manycore processor. In: Alexandrov, V.N., Lees, M., Krzhizhanovskaya, V.V., Dongarra, J., Sloot, P.M.A. (eds.) ICCS. Procedia Computer Science, pp. 1624–1633. Elsevier, Amsterdam (2013)

    Google Scholar 

  6. Bandishti, V., Pananilath, I., Bondhugula, U.: Tiling stencil computations to maximize parallelism, November 2012

    Google Scholar 

  7. Bilodeau, M., Clienti, C., Coelho, F., Guelton, S., Irigoin, F., Keryell, R., Lemonnier, F.: FREIA: Framework for Embedded Image Applications (2008–2011). freia.enstb.org, French ANR-funded project with ARMINES (CMM, CRI), THALES (TRT) and Télécom Bretagne

  8. Bonnot, P., Lemonnier, F., Edelin, G., Gaillat, G., Ruch, O., Gauget, P.: Definition and SIMD implementation of a multi-processing architecture approach on FPGA. In: Design Automation and Test in Europe, pp. 610–615. IEEE, December 2008

    Google Scholar 

  9. Bosilca, G., Bouteiller, A., Danalis, A., Herault, T., Lemarinier, P., Dongarra, J.: DAGuE: a generic distributed DAG engine for high performance computing. Parallel Comput. 38(1–2), 37–51 (2012). http://linkinghub.elsevier.com/retrieve/pii/S0167819111001347

    Article  Google Scholar 

  10. Chambers, C., Raniwala, A., Perry, F., Adams, S., Henry, R.R., Bradshaw, R., Weizenbaum, N.: FlumeJava: easy, efficient data-parallel pipelines, p. 363. ACM Press (2010). http://portal.acm.org/citation.cfm?doid=1806596.1806638

  11. Clienti, C.: Fulguro image processing library. Source Forge (2008)

    Google Scholar 

  12. Clienti, C.: Architectures flots de données dédiées autraitement d’images par la Morphologie MATHÉMATIQUE. Ph.D. thesis, MINES ParisTech, September 2009

    Google Scholar 

  13. Clienti, C., Beucher, S., Bilodeau, M.: A system on chip dedicated to pipeline neighborhood processing for mathematical morphology. In: EUSIPCO: European Signal Processing Conference, August 2008

    Google Scholar 

  14. Coelho, F., Irigoin, F.: API compilation for image hardware accelerators. ACM Trans. Archit. Code Optim. 9(4), 1–25 (2013)

    Article  Google Scholar 

  15. CRI, MINES ParisTech: PIPS (1989–2012). pips4u.org, open source research compiler, under GPLv3

  16. Datta, K., Murphy, M., Volkov, V., Williams, S., Carter, J., Oliker, L., Patterson, D., Shalf, J., Yelick, K.: Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures. In: SC 2008: Conference on Supercomputing, pp. 1–12. IEEE Press (2008)

    Google Scholar 

  17. Dupont de Dinechin, B., Sirdey, R., Goubier, T.: Extended cyclostatic dataflow program compilation and execution for an integrated manycore processor. In: Procedia Computer Science, vol. 18 (2013)

    Google Scholar 

  18. Gordon, M.I., Thies, W., Amarasinghe, S.: Exploiting coarse-grained task, data, and pipeline parallelism in stream programs. ACM SIGPLAN Not. 41(11), 151 (2006). http://portal.acm.org/citation.cfm?doid=1168918.1168877

  19. Goubier, T., Sirdey, R., Louise, S., David, V.: \(\Sigma \)C: a programming model and language for embedded manycores. In: Xiang, Y., Cuzzocrea, A., Hobbs, M., Zhou, W. (eds.) ICA3PP 2011, Part I. LNCS, vol. 7016, pp. 385–394. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  20. Halbwachs, N., Caspi, P., Raymond, P., Pilaud, D.: The synchronous dataflow programming language LUSTRE. Proc. IEEE 79(9), 1305–1320 (1991)

    Article  Google Scholar 

  21. Irigoin, F., Jouvelot, P., Triolet, R.: Semantical interprocedural parallelization: an overview of the PIPS project. In: Proceedings of ICS 1991, pp. 244–251. ACM Press (1991)

    Google Scholar 

  22. Kahn, G.: The semantics of a simple language for parallel programming. p. 5 (1974)

    Google Scholar 

  23. KHRONOS group: OpenCL computing language v1.0, December 2008

    Google Scholar 

  24. Le Guernic, P., Benveniste, A., Bournai, P., Gautier, T.: Signal-a data flow-oriented language for signal processing. IEEE Trans. Acoust. Speech Signal Process. 34(2), 362–374 (1986)

    Article  Google Scholar 

  25. Lee, E.A., Messerschmitt, D.G.: Static scheduling of synchronous data flow programs for digital signal processing. IEEE Trans. Comput. 36(1), 24–35 (1987)

    Article  MATH  Google Scholar 

  26. Murthy, P.K., Lee, E.A.: Multidimensional synchronous dataflow. IEEE Trans. Signal Process. 50, 3306–3309 (2002)

    Article  Google Scholar 

  27. OpenMP architecture review board: OpenMP application program interface, Version 3.0, May 2008

    Google Scholar 

  28. Pop, A.: Leveraging streaming for deterministic parallelization - an integrated language, compiler and runtime approach. Ph.D. thesis, MINES ParisTech, September 2011

    Google Scholar 

  29. Ragan-Kelley, J., Barnes, C., Adams, A., Paris, S., Durand, F., Amarasinghe, S.: Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. In: PLDI 2013, p. 12, June 2013

    Google Scholar 

  30. Soile, P.: Morphological Image Analysis. Springer, Heidelberg (2003)

    Google Scholar 

  31. Stephens, R.: A Survey Of Stream Processing. Springer, Heidelberg (1995)

    Google Scholar 

Download references

Aknowledgements

Thanks to Danielle Bolan and Pierre Jouvelot for proof-reading, Antoniu Pop for his advises and bibliography pointers, Kalray engineers Frédéric Blanc, Jérôme Bussery and Stéphane Gailhard for their support and to anonymous reviewers whose comments greatly helped to improve this paper.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Pierre Guillou .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Guillou, P., Coelho, F., Irigoin, F. (2015). Automatic Streamization of Image Processing Applications. In: Brodman, J., Tu, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2014. Lecture Notes in Computer Science(), vol 8967. Springer, Cham. https://doi.org/10.1007/978-3-319-17473-0_15

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-17473-0_15

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-17472-3

  • Online ISBN: 978-3-319-17473-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics