Abstract
All scalable parallel computers feature a memory hierarchy, in which some locations are “closer” to a particular processor than others. The hardware in a particular system may support a shared memory or message passing programming model, but these factors affect only the relative costs of local and remote accesses, not the system’s fundamental Non-Uniform Memory Access (NUMA) characteristics. Yet while the efficient management of memory hierarchies is fundamental to high performance in scientific computing, existing parallel languages and tools provide only limited support for this management task. Recognizing this deficiency, we propose abstractions and programming tools that facilitate the explicit management of memory hierarchies by the programmer, and hence the efficient programming of scalable parallel computers. The abstractions comprise local arrays, global (distributed) arrays, and disk resident arrays located on secondary storage. The tools comprise the Global Arrays library, which supports the transfer of data between local and global arrays, and the Disk Resident Arrays (DRA) library, which transfers data between global and disk resident arrays. We describe the shared memory NUMA model implemented in the tools, discuss extensions for wide area computing environments, and review major applications of the tools, which currently total over one million lines of code.
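The three-level hierarchy described above (local arrays in a process's own memory, a global array partitioned across processes, and a disk resident array on secondary storage) can be illustrated with a small, self-contained sketch. This is a conceptual toy model, not the actual Global Arrays or DRA API; all names here (`ToyGlobalArray`, `ToyDiskArray`, `put`, `get`) are illustrative assumptions, and the simulated "processes" are just per-owner buffers in one address space.

```python
# Toy model (NOT the Global Arrays / DRA API) of explicit data movement
# between the three levels of the hierarchy: local <-> global <-> disk.
import array
import os
import tempfile

class ToyGlobalArray:
    """A 1-D 'global' array block-partitioned across nproc simulated owners."""
    def __init__(self, n, nproc):
        self.n = n
        self.block = (n + nproc - 1) // nproc
        # one locally held block per simulated process
        self.parts = [array.array('d', [0.0] * self.block) for _ in range(nproc)]

    def put(self, lo, data):
        # one-sided put: scatter a local array section into the owning blocks
        for i, v in enumerate(data):
            g = lo + i
            self.parts[g // self.block][g % self.block] = v

    def get(self, lo, hi):
        # one-sided get: gather a global section into a fresh local array
        return array.array('d', (self.parts[g // self.block][g % self.block]
                                 for g in range(lo, hi)))

class ToyDiskArray:
    """A 'disk resident' array backed by a binary file of 8-byte doubles."""
    def __init__(self, path, n):
        self.path = path
        with open(path, 'wb') as f:
            f.write(array.array('d', [0.0] * n).tobytes())

    def write(self, lo, data):          # global -> disk transfer
        with open(self.path, 'r+b') as f:
            f.seek(lo * 8)
            f.write(data.tobytes())

    def read(self, lo, hi):             # disk -> global transfer
        with open(self.path, 'rb') as f:
            f.seek(lo * 8)
            out = array.array('d')
            out.frombytes(f.read((hi - lo) * 8))
            return out

# Move data explicitly down and back up the hierarchy:
local = array.array('d', [1.0, 2.0, 3.0, 4.0])
g = ToyGlobalArray(n=8, nproc=2)
g.put(2, local)                                   # local  -> global
d = ToyDiskArray(os.path.join(tempfile.mkdtemp(), 'dra.bin'), 8)
d.write(0, g.get(0, 8))                           # global -> disk
print(list(d.read(2, 6)))                         # -> [1.0, 2.0, 3.0, 4.0]
```

The point of the sketch is the programming model, not the implementation: every transfer between levels is an explicit, programmer-initiated copy of an array section, which is exactly the style of hierarchy management the abstract advocates.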
Copyright information
© 1997 Springer Science+Business Media Dordrecht
Cite this chapter
Nieplocha, J., Harrison, R., Foster, I. (1997). Explicit Management of Memory Hierarchy. In: Grandinetti, L., Kowalik, J., Vajtersic, M. (eds) Advances in High Performance Computing. NATO ASI Series, vol 30. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-5514-4_11
DOI: https://doi.org/10.1007/978-94-011-5514-4_11
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-010-6322-7
Online ISBN: 978-94-011-5514-4