Abstract
Research projects produce huge amounts of data, which have to be stored and analyzed immediately after the acquisition. Storing and analyzing of high data rates are normally not possible within the detectors and can be worse if several detectors with similar data rates are used within a project. In order to store the data for analysis, it has to be transferred on an appropriate infrastructure, where it is accessible at any time and from different clients. The Large Scale Data Facility (LSDF), which is currently developed at KIT, is designed to fulfill the requirements of data intensive scientific experiments or applications. Currently, the LSDF consists of a testbed installation for evaluating different technologies. From a user point of view, the LSDF is a huge data sink, providing in the initial state 6 PB of storage, and will be accessible via a couple of interfaces. As a user is not interested in learning dozens of APIs for accessing data a generic API, the ADALAPI, has been designed, providing unique interfaces for the transparent access to the LSDF over different technologies. The present contribution evaluates technologies useable for the development of the LSDF to meet the requirements of various scientific projects. Also, the ADALAPI and the first GUI based on it are introduced.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
W. Allcock, J. Bresnahan, R. Kettimuthu, and M. Link, “The Globus Striped GridFTP Framework and Server,” in Supercomputing, 2005. Proceedings of the ACM/IEEE SC 2005 Conference, November 2005, pp. 54–54.
J. Bresnahan, M. Link, R. Kettimuthu, D. Fraser, and I. Foster, “GridFTP Pipelining,” in Proceedings of the 2007 TeraGrid Conference, June 2007.
S. Hermann, H. Marten, and J. v. Wezel, “Operating a TIER1 centre as part of a grid environment,” in Proceedings of the Conference on Computing in High Energy and Nuclear Physics (CHEP 2006), February 2006.
T. S. Pettersson and P. Lefèvre, “The large hadron collider: conceptual design,” CERN, Geneva, Tech. Rep. CERN-AC-95-05 LHC, October 1995.
The D-Grid project. (2010, January) D-Grid-Initiative: D-Grid-Initiative. [Online]. Available: http://www.d-grid.de/index.php?
Sun Microsystems, “Nfs: Network file system protocol specification,” The Internet Engineering Task Force, Tech. Rep. RFC 1094, March 1989.
The Apache Software Foundation. (2010, January) Welcome to Apache Hadoop! [Online]. Available: http://hadoop.apache.org/
J. Dean and S. Ghemawat, “Mapreduce: Simplified data processing on large clusters,” in Proceedings of the 6th Symposium on Operating Systems Design and Implementation, 2004, pp. 137–149.
F. Schmuck and R. Haskin, “Gpfs: A shared-disk file system for large computing clusters,” in Proceedings of the 2002 Conference on File and Storage Technologies (FAST), 2002, pp. 231–244.
Microsoft Corporation. (2010, January) Microsoft SMB Protocol and CIFS Protocol Overview (Windows). [Online]. Available: http://msdn.microsoft.com/en-us/library/aa365233%28VS.85%29.aspx
The Samba team. (2010, January) Samba - opening windows to a wider world. [Online]. Available: http://samba.org/
J. H. Howard, M. L. Kazar, S. G. Menees, D. A. Nichols, M. Satyanarayanan, R. N. Sidebotham, and M. J. West, “Scale and performance in a distributed file system,” ACM Transactions on Computer Systems, vol. 6, no. 1, pp. 51–81, 1988.
Gluster Software India. (2010, January) Gluster: Open Source Clustered Storage. Easy-to-Use Scale-Out File System. [Online]. Available: http://www.gluster.com/
P. H. Carns and W. B. Ligon III and R. B. Ross and R. Thakur, “Pvfs: A parallel file system for linux clusters,” in Proceedings of the 4th Annual Linux Showcase and Conference, Atlanta, GA, October 2000, pp. 317–327.
C. Baru, R. Moore, A. Rajasekar, and M. Wan, “The SDSC storage resource broker,” in CASCON ’98: Proceedings of the 1998 conference of the Centre for Advanced Studies on Collaborative research. IBM Press, 1998, p. 5.
A. Rajasekar, M. Wan, R. Moore, and W. Schroeder, “A Prototype Rule-based Distributed Data Management System,” in HPDC workshop on ”Next Generation Distributed Data Management”, May 2006.
M. Ernst, P. Fuhrmann, M. Gasthuber, T. Mkrtchyan, and C. Waldman, “dCache, a Distributed Storage Data Caching System,” in Proceedings of the International CHEP 2001, Beijing, China, September 2001.
G. Lo Presti, O. Barring, A. Earl, R. Garcia Rioja, S. Ponce, G. Taurelli, D. Waldron, and M. Coelho Dos Santos, “CASTOR: A Distributed Storage Resource Facility for High Performance Data Processing at CERN,” 24th IEEE Conference on Mass Storage Systems and Technologies, 2007. MSST 2007., pp. 275–280, September 2007.
F. Donno, A. Ghiselli, L. Magnoni, and R. Zappi, “StoRM: GRID middleware for disk resource management,” in Proceedings of the International CHEP 2004, Interlaken, Switzerland, October 2004.
A. Shoshani, A. Sim, and J. Gu, “Storage Resource Managers,” in Grid Resource Management: State of the Art and Future Trends, 1st ed. Springer, 2003, ch. 20, pp. 321–340.
O. Tatebe, Y. Morita, S. Matsuoka, N. Soda, and S. Sekiguchi, “Grid Datafarm Architecture for Petascale Data Intensive Computing,” Cluster Computing and the Grid, 2002. 2nd IEEE/ACM International Symposium on, May 2002.
The xrootd team. (2009, January) The Scalla Software Suite: xrootd/cmsd. [Online]. Available: http://xrootd.slac.stanford.edu/
Lawrence Berkeley National Laboratory Scientific Data Management Research Group. (2009, January) BeStMan. [Online]. Available: http://datagrid.lbl.gov/bestman/
L. Abadie, P. Badino, J.-P. Baud, J. Casey, A. Frohner, G. Grosdidier, S. Lemaitre, G. Mccance, R. Mollon, K. Nienartowicz, D. Smith, and P. Tedesco, “Grid–Enabled Standards–based Data Management,” 24th IEEE Conference on Mass Storage Systems and Technologies, 2007. MSST 2007., pp. 60–71, Sept. 2007.
J. Postel and J. Reynolds, “File transfer protocol (ftp),” The Internet Engineering Task Force, Tech. Rep. RFC 959, October 1985.
M. Horowitz, C. Solutions, and S. Lunt, “Ftp security extensions,” The Internet Engineering Task Force, Tech. Rep. RFC 2228, October 1997.
T. Ylonen and SSH Communications Security Corp and C. Lonvick and Cisco Systems Inc., “The secure shell (ssh) authentication protocol,” The Internet Engineering Task Force, Tech. Rep. RFC 4252, January 2006.
dCache.org. (2009, January) The dCache Book. [Online]. Available: http://www.dcache.org/manuals/Book/
G. V. Laszewski, I. Foster, J. Gawor, P. Lane, N. Rehn, and M. Russell, “A java commodity grid kit,” Concurrency and Computation: Practice and Experience, vol. 13, Issues 8, pp. 645–662, 2001.
I. Foster, “Globus Toolkit Version 4: Software for Service-Oriented Systems,” in IFIP International Conference on Network and Parallel Computing, Springer-Verlag LNCS 3779, 2005, pp. 2–13.
C. Bauer and G. King, Java Persistence with Hibernate. Greenwich, CT, USA: Manning Publications Co., 2006.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2012 Springer Science+Business Media, LLC
About this paper
Cite this paper
Sutter, M. et al. (2012). File Systems and Access Technologies for the Large Scale Data Facility. In: Davoli, F., Lawenda, M., Meyer, N., Pugliese, R., Węglarz, J., Zappatore, S. (eds) Remote Instrumentation for eScience and Related Aspects. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-0508-5_16
Download citation
DOI: https://doi.org/10.1007/978-1-4614-0508-5_16
Published:
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-0507-8
Online ISBN: 978-1-4614-0508-5
eBook Packages: EngineeringEngineering (R0)