Abstract
This paper introduces the write-once/read-many XMLtape/ARC storage approach for Digital Objects and their constituent datastreams. The approach combines two interconnected file-based storage mechanisms that are made accessible in a protocol-based manner. First, XML-based representations of multiple Digital Objects are concatenated into a single file named an XMLtape. An XMLtape is a valid XML file; its format definition is independent of the choice of the XML-based complex object format by which Digital Objects are represented. The creation of indexes for both the identifier and the creation datetime of the XML-based representation of the Digital Objects facilitates OAI-PMH-based access to Digital Objects stored in an XMLtape. Second, ARC files, as introduced by the Internet Archive, are used to contain the constituent datastreams of the Digital Objects in a concatenated manner. An index for the identifier of the datastream facilitates OpenURL-based access to an ARC file. The interconnection between XMLtapes and ARC files is provided by conveying the identifiers of ARC files associated with an XMLtape as administrative information in the XMLtape, and by including OpenURL references to constituent datastreams of a Digital Object in the XML-based representation of that Digital Object.
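The XMLtape side of this design can be illustrated with a minimal sketch: XML representations are concatenated into one well-formed file, and two indexes (by identifier and by datetime) record where each one sits so it can be read back without parsing the whole file. The element names `tape` and `record`, the function names, and the in-memory indexes are assumptions made for illustration only. They are not taken from the XMLtape XML Schema or from the authors' implementation.

```python
import io
from bisect import bisect_left
import xml.etree.ElementTree as ET


def write_tape(records):
    """Concatenate XML fragments into one tape and index them.

    records: iterable of (identifier, datestamp, xml_fragment) tuples, where
    datestamp is an ISO 8601 UTC string (these sort lexicographically).
    Returns (tape_bytes, id_index, date_index).
    """
    buf = io.BytesIO()
    # Hypothetical wrapper elements; the real format is defined by its schema.
    buf.write(b'<?xml version="1.0" encoding="UTF-8"?>\n<tape>\n')
    id_index = {}      # identifier -> (byte offset, byte length)
    date_index = []    # sorted list of (datestamp, identifier)
    for identifier, datestamp, fragment in records:
        data = ("<record>" + fragment + "</record>\n").encode("utf-8")
        offset = buf.tell()
        buf.write(data)
        id_index[identifier] = (offset, len(data))
        date_index.append((datestamp, identifier))
    buf.write(b"</tape>\n")
    date_index.sort()
    return buf.getvalue(), id_index, date_index


def get_record(tape, id_index, identifier):
    """Fetch one record by identifier using its stored byte range."""
    offset, length = id_index[identifier]
    return tape[offset:offset + length].decode("utf-8")


def list_identifiers(date_index, from_date):
    """Return identifiers whose datestamp is >= from_date (inclusive)."""
    start = bisect_left(date_index, (from_date, ""))
    return [identifier for _, identifier in date_index[start:]]


# Example with made-up identifiers and content.
records = [
    ("info:example/a", "2005-01-01T00:00:00Z", "<obj>A</obj>"),
    ("info:example/b", "2005-02-01T00:00:00Z", "<obj>B</obj>"),
]
tape, id_index, date_index = write_tape(records)
root = ET.fromstring(tape)  # the concatenated file is still well-formed XML
```

The identifier lookup corresponds loosely to fetching a single record, and the datetime lookup to a date-selective harvest, which are the two kinds of access the indexes are said to facilitate for OAI-PMH. A persistent key-value store would replace the in-memory dictionary and list in practice.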
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, X., Balakireva, L., Hochstenbach, P., Van de Sompel, H. (2005). File-Based Storage of Digital Objects and Constituent Datastreams: XMLtapes and Internet Archive ARC Files. In: Rauber, A., Christodoulakis, S., Tjoa, A.M. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2005. Lecture Notes in Computer Science, vol 3652. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551362_23
DOI: https://doi.org/10.1007/11551362_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28767-4
Online ISBN: 978-3-540-31931-3
eBook Packages: Computer Science, Computer Science (R0)