Meta-data Management System for High-Performance Large-Scale Scientific Data Access

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1970)

Abstract

Many scientific applications manipulate large amounts of data and are therefore parallelized on high-performance computing systems to take advantage of their computational power and memory space. The data processed by these large-scale applications can easily exceed the disk capacity of most systems, so tertiary storage devices are used to store it. Parallelizing this type of application requires understanding not only the data partition pattern among multiple processors but also the underlying storage architecture and the data storage pattern. In this paper, we present a meta-data management system that uses a database to record information about datasets and manages this meta-data to provide a suitable I/O interface. As a result, users access data by specifying dataset names rather than physical data locations, and the system issues optimal I/O calls without requiring users to know the underlying storage structure. We use an astrophysics application to demonstrate that the management system provides a convenient programming environment with negligible database access overhead.
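The idea of resolving a dataset name to its storage details through a database can be illustrated with a small sketch. This is not the authors' implementation: the table layout, the `DatasetInfo` fields, the storage and access-pattern categories, and the strategy names returned by `choose_io_strategy` are all assumptions made for illustration, and SQLite stands in for whatever database the paper uses.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class DatasetInfo:
    """Hypothetical meta-data record for one dataset."""
    name: str
    location: str        # physical location, hidden from the application
    storage: str         # assumed categories: "disk" or "tape"
    access_pattern: str  # assumed categories: "block" or "strided"


def open_catalog() -> sqlite3.Connection:
    """Create an in-memory catalog table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE datasets ("
        "name TEXT PRIMARY KEY, location TEXT, "
        "storage TEXT, access_pattern TEXT)"
    )
    return conn


def register(conn: sqlite3.Connection, info: DatasetInfo) -> None:
    """Record a dataset's meta-data in the catalog."""
    conn.execute(
        "INSERT INTO datasets VALUES (?, ?, ?, ?)",
        (info.name, info.location, info.storage, info.access_pattern),
    )
    conn.commit()


def lookup(conn: sqlite3.Connection, name: str) -> DatasetInfo:
    """Resolve a dataset name to its stored meta-data."""
    row = conn.execute(
        "SELECT name, location, storage, access_pattern "
        "FROM datasets WHERE name = ?",
        (name,),
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown dataset: {name}")
    return DatasetInfo(*row)


def choose_io_strategy(info: DatasetInfo) -> str:
    """Pick an I/O strategy label from the meta-data (illustrative rules only)."""
    if info.storage == "tape":
        return "stage_then_read"
    if info.access_pattern == "strided":
        return "collective_read"
    return "independent_read"


# Usage: the application names the dataset and never handles its path.
catalog = open_catalog()
register(catalog, DatasetInfo("temperature", "/pfs/run1/temp.dat", "disk", "strided"))
print(choose_io_strategy(lookup(catalog, "temperature")))
```

In this sketch the application only ever passes a dataset name; the physical location and the choice of I/O call stay behind the catalog, which is the separation the abstract describes.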

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liao, Wk., Shen, X., Choudhary, A. (2000). Meta-data Management System for High-Performance Large-Scale Scientific Data Access. In: Valero, M., Prasanna, V.K., Vajapeyam, S. (eds) High Performance Computing — HiPC 2000. HiPC 2000. Lecture Notes in Computer Science, vol 1970. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44467-X_26

  • DOI: https://doi.org/10.1007/3-540-44467-X_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41429-2

  • Online ISBN: 978-3-540-44467-1

  • eBook Packages: Springer Book Archive
