Abstract
The widening disparity between the performance of input/output (I/O) devices and that of processors and communication links on parallel systems is a major obstacle to achieving high performance for a wide range of parallel applications. I/O hardware and file system parallelism are the keys to bridging this performance gap. A prerequisite to the development of efficient parallel file systems is a detailed characterization of the I/O demands of parallel applications. In this paper, we present a comparative study of the I/O access patterns commonly found in I/O-intensive parallel applications. Using the Pablo performance analysis environment and its I/O extensions, we captured application I/O access patterns and analyzed their interactions with current parallel I/O systems. This analysis has proven instrumental in guiding the development of new application programming interfaces (APIs) for parallel file systems and in developing effective file system policies that can adaptively respond to complex application I/O requirements.
This work was supported in part by the Defense Advanced Research Projects Agency under DARPA contracts DABT63-94-C0049 (SIO Initiative), DAVT63-91-C-0029, DABT63-93-C-0040, F30602-96-C-0161, DABT63-96-C-0027, and F30602-96-2-0264, by the National Science Foundation under grant NSF ASC 92-12369, and by the National Aeronautics and Space Administration under NASA contracts NGT-51023, USRA 5555-22, and NAG-1-613.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Smirni, E., Reed, D.A. (1997). Workload characterization of input/output intensive parallel applications. In: Marie, R., Plateau, B., Calzarossa, M., Rubino, G. (eds) Computer Performance Evaluation Modelling Techniques and Tools. TOOLS 1997. Lecture Notes in Computer Science, vol 1245. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0022205
DOI: https://doi.org/10.1007/BFb0022205
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63101-9
Online ISBN: 978-3-540-69131-0
eBook Packages: Springer Book Archive