Parallel Programming Models

Chapter

Abstract

The parallel programming landscape is constantly changing and being enriched with new languages, tools and techniques. In this chapter we give a survey of the parallel programming models available today. These models are suitable for general-purpose computing, but also for programming specialized (e.g. embedded) systems that offer the required facilities. We start by categorizing the available models according to the memory abstraction they expose to the programmer and then present the representative styles and languages in each category. We cover shared-memory models, distributed-memory models, models for devices with private memory spaces such as GPUs and accelerators, as well as models that combine the aforementioned ones in some manner. We conclude with a look at some other models that do not fall directly into the above categories but nevertheless have a significance of their own.
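
To make the memory-abstraction categorization concrete, the sketch below (illustrative only, not code from the chapter) computes a global sum in the hybrid style surveyed here: each MPI process owns a private slice of the data (distributed memory), OpenMP threads cooperate on that shared slice (shared memory), and explicit message passing combines the per-process results. The array size, names and compile command are our own assumptions.

```c
/* Hypothetical hybrid MPI + OpenMP sketch of a global sum.
 * Compile, for example, with: mpicc -fopenmp sum.c -o sum  (command is illustrative)
 */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N_PER_RANK 1000000   /* elements owned by each MPI process (assumed) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Distributed memory: each rank allocates and initializes only its own slice. */
    double *local = malloc(N_PER_RANK * sizeof(double));
    for (int i = 0; i < N_PER_RANK; i++)
        local[i] = 1.0;                      /* arbitrary data */

    /* Shared memory: the threads of this rank share the slice and reduce over it. */
    double local_sum = 0.0;
    #pragma omp parallel for reduction(+:local_sum)
    for (int i = 0; i < N_PER_RANK; i++)
        local_sum += local[i];

    /* Explicit message passing combines the per-rank partial sums at rank 0. */
    double global_sum = 0.0;
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum = %.0f (expected %.0f)\n",
               global_sum, (double)N_PER_RANK * nprocs);

    free(local);
    MPI_Finalize();
    return 0;
}
```

A pure shared-memory version would keep only the OpenMP loop over a single array, while a pure distributed-memory version would keep only the MPI calls; the hybrid form combines the two abstractions, as discussed later in the chapter.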

Keywords

Titanium Coherence Prefix eSkel 


Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Computer Science and Engineering, University of Ioannina, Ioannina, Greece
