Evaluating Arithmetic Expressions Using Tree Contraction: A Fast and Scalable Parallel Implementation for Symmetric Multiprocessors (SMPs)

Extended Abstract
  • David A. Bader
  • Sukanya Sreshta
  • Nina R. Weisse-Bernstein
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2552)


The ability to provide uniform shared-memory access to a significant number of processors in a single SMP node brings us much closer to the ideal PRAM parallel computer. In this paper, we develop new techniques for designing a uniform shared-memory algorithm from a PRAM algorithm and present the results of an extensive experimental study demonstrating that the resulting programs scale nearly linearly across a significant range of processors and across the entire range of instance sizes tested. This linear speedup with the number of processors is one of the first ever attained in practice for intricate combinatorial problems. The example we present in detail here is for evaluating arithmetic expression trees using the algorithmic techniques of list ranking and tree contraction; this problem is not only of interest in its own right, but is representative of a large class of irregular combinatorial problems that have simple and efficient sequential implementations and fast PRAM algorithms, but have no known efficient parallel implementations. Our results thus offer promise for bridging the gap between the theory and practice of shared-memory parallel algorithms.
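As a rough illustration of the tree-contraction idea named above, the following sketch evaluates a binary expression tree by repeatedly "raking" leaves. It is a sequential simulation written for this page, not the authors' SMP implementation. The names `Node`, `rake`, and `evaluate` are hypothetical, and the sketch assumes every internal node has exactly two children and carries either `+` or `*`. Each node holds a label `(a, b)`, meaning it contributes `a*x + b` to its parent, where `x` is the node's own value.

```python
class Node:
    """Expression-tree node (hypothetical layout, for illustration only)."""
    def __init__(self, op=None, value=0, left=None, right=None):
        self.op, self.value = op, value      # op is '+', '*', or None for a leaf
        self.left, self.right, self.parent = left, right, None
        self.a, self.b = 1, 0                # label: contributes a*x + b upward
        for child in (left, right):
            if child is not None:
                child.parent = self


def leaves_in_order(root):
    """Return the leaves from left to right (iterative depth-first walk)."""
    out, stack = [], [root]
    while stack:
        n = stack.pop()
        if n.op is None:
            out.append(n)
        else:
            stack.append(n.right)
            stack.append(n.left)
    return out


def rake(u, root):
    """Remove leaf u and its parent p; the sibling s takes p's place.

    The label of s is updated so the tree's value is unchanged.
    Returns the (possibly new) root.
    """
    p = u.parent
    s = p.right if p.left is u else p.left
    c = u.a * u.value + u.b                  # constant contributed by u
    if p.op == '+':                          # p.a*(c + (s.a*x + s.b)) + p.b
        s.a, s.b = p.a * s.a, p.a * (c + s.b) + p.b
    else:                                    # p.a*(c * (s.a*x + s.b)) + p.b
        s.a, s.b = p.a * c * s.a, p.a * c * s.b + p.b
    g = p.parent
    s.parent = g
    if g is None:
        return s
    if g.left is p:
        g.left = s
    else:
        g.right = s
    return root


def evaluate(root):
    """Contract the tree round by round until a single leaf remains."""
    leaves = leaves_in_order(root)
    while len(leaves) > 1:
        odd = leaves[0::2]                   # leaves at odd positions (1-based)
        deferred = []
        for u in odd:                        # pass 1: odd leaves that are left children
            if u.parent.left is u:
                root = rake(u, root)
            else:
                deferred.append(u)
        for u in deferred:                   # pass 2: odd leaves that are right children
            root = rake(u, root)
        leaves = leaves[1::2]                # even-position leaves survive the round
    last = leaves[0]
    return last.a * last.value + last.b


# (2 + 3) * (4 + 5)
tree = Node('*', left=Node('+', left=Node(value=2), right=Node(value=3)),
                 right=Node('+', left=Node(value=4), right=Node(value=5)))
print(evaluate(tree))
```

Each round removes about half of the leaves, so the number of rounds is logarithmic in the number of leaves. In the usual parallel formulation the rakes within one pass are intended to run concurrently because they involve non-adjacent leaves; here they simply run one after another.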


Expression Evaluation, Tree Contraction, Parallel Graph Algorithms, Shared Memory, High-Performance Algorithm Engineering





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • David A. Bader (1)
  • Sukanya Sreshta (1)
  • Nina R. Weisse-Bernstein (1)
  1. Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, USA
