
Thread Migration in a Parallel Graph Reducer

  • Conference paper
  • In: Implementation of Functional Languages (IFL 2002)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2670)

Abstract

To support high-level coordination, parallel functional languages need effective and automatic work distribution mechanisms. Many implementations distribute potential work, i.e. sparks or closures, but there is good evidence that the performance of certain classes of program can be improved if current work, i.e. running threads, is also distributed. Migrating a thread incurs significant execution cost and requires careful scheduling and an elaborate implementation.
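
To make the spark/thread distinction concrete, here is a minimal sketch in the style of GpH, written against GHC's Control.Parallel module rather than taken from the paper: `par` only records potential work as a spark, and a spark becomes current work, i.e. a running thread, only once an idle processor picks it up and starts evaluating it.

    import Control.Parallel (par, pseq)

    -- Divide-and-conquer sketch: each call sparks its left branch as
    -- *potential* work.  An idle processor may turn the spark into a
    -- running thread; such running threads are the "current work" that
    -- thread migration moves between processors.
    nfib :: Int -> Integer
    nfib n
      | n < 2     = 1
      | otherwise = l `par` (r `pseq` l + r + 1)
      where
        l = nfib (n - 1)
        r = nfib (n - 2)

    main :: IO ()
    main = print (nfib 30)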

This paper describes the design, implementation and performance of thread migration in the GUM runtime system underlying Glasgow parallel Haskell (GpH). Measurements of non-trivial programs on a high-latency cluster architecture show that thread migration can improve the performance of data-parallel and divide-and-conquer programs with low processor utilisation. Thread migration also reduces the variation in performance results obtained in separate executions of a program. Moreover, migration does not incur significant overheads if there are no migratable threads, or on a single processor. However, for programs that already exhibit good processor utilisation, migration may increase performance variability and very occasionally reduce performance.
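
The concrete benchmarks are not listed in this abstract, but the data-parallel class of program referred to can be illustrated with a small, assumed example using GHC's Control.Parallel.Strategies: one spark is created per list element, and when element costs are uneven some processors exhaust their sparks while others still hold long-running threads, which is exactly the load imbalance that migrating threads can repair.

    import Control.Parallel.Strategies (parList, rdeepseq, using)

    -- Data-parallel pattern: spark one evaluation per list element.
    -- With uneven element costs, some processors run out of sparks while
    -- others still hold long-running threads; migrating those threads
    -- rebalances the load.
    parMap' :: (Int -> Integer) -> [Int] -> [Integer]
    parMap' f xs = map f xs `using` parList rdeepseq

    main :: IO ()
    main = print (sum (parMap' (\n -> product [1 .. toInteger n]) [1000, 2000 .. 20000]))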

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Du Bois, A.R., Loidl, HW., Trinder, P. (2003). Thread Migration in a Parallel Graph Reducer. In: Peña, R., Arts, T. (eds) Implementation of Functional Languages. IFL 2002. Lecture Notes in Computer Science, vol 2670. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44854-3_13

  • DOI: https://doi.org/10.1007/3-540-44854-3_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40190-2

  • Online ISBN: 978-3-540-44854-9
