Abstract
Digital libraries encompass an enormous range of functionality. One fundamental task of a digital library is to store digital content. Conceptually, storage management in a digital library is delegated to the repository component. As the amount and complexity of content continue to grow, the issue of scalability becomes more pressing. Like traditional file systems, digital library repositories must be capable of scaling to billions of objects and petabytes (2^50 bytes) of total storage.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Fox, E.A., Mather, P. (2003). Object Repositories for Digital Libraries. In: Feng, D.D., Siu, WC., Zhang, HJ. (eds) Multimedia Information Retrieval and Management. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-05300-3_13
DOI: https://doi.org/10.1007/978-3-662-05300-3_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05533-1
Online ISBN: 978-3-662-05300-3
eBook Packages: Springer Book Archive