The Abstract Streaming Machine: Compile-Time Performance Modelling of Stream Programs on Heterogeneous Multiprocessors

  • Paul M. Carpenter
  • Alex Ramirez
  • Eduard Ayguade
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5657)


Stream programming offers a portable way for regular applications such as digital video, software radio, multimedia and 3D graphics to exploit a multiprocessor machine. The compiler maps a portable stream program onto the target, automatically sizing communications buffers and applying optimizing transformations such as task fission or fusion, unrolling loops and aggregating communication. We present a machine description and performance model for an iterative stream compilation flow, which represents the stream program running on a heterogeneous multiprocessor system with distributed or shared memory. The model is a key component of the ACOTES open-source stream compiler currently under development. Our experiments on the Cell Broadband Engine show that the predicted throughput has a maximum relative error of 15% across our benchmarks.
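The accuracy figure quoted above is a maximum relative error over a set of benchmarks. The sketch below shows one way to compute such a metric. It assumes that relative error is taken against the measured throughput and that every measured value is non-zero. The function name `max_relative_error` and the sample numbers are hypothetical illustrations, not taken from the paper.

```python
from typing import Sequence


def max_relative_error(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Return the largest |predicted - measured| / |measured| over paired benchmarks.

    Assumption: the error is relative to the measured value, which must be non-zero.
    """
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return max(abs(p - m) / abs(m) for p, m in zip(predicted, measured))


# Hypothetical throughput figures in arbitrary units, NOT results from the paper.
predicted = [100.0, 230.0, 57.0]
measured = [110.0, 200.0, 60.0]
worst = max_relative_error(predicted, measured)
```

With these made-up inputs the second benchmark dominates, because its prediction is off by 30 units against a measured 200.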


Keywords (added by machine, not by the authors): Finite Impulse Response; Maximum Relative Error; Software Radio; Simulated Trace; Cell Processor





Copyright information

© IFIP International Federation for Information Processing 2009

Authors and Affiliations

  • Paul M. Carpenter (1)
  • Alex Ramirez (1)
  • Eduard Ayguade (1)
  1. Barcelona Supercomputing Center, Barcelona, Spain
