Synonyms
Dedup; Single instancing
Definition
Deduplication is the elimination of redundant data in data storage so as to reduce the required capacity. Its benefits include reduced rack space and power consumption for the data storage. Deduplication is often implemented in archival storage systems such as content-addressable storage (CAS) systems and virtual tape libraries (VTLs). The term is sometimes shortened to Dedup.
Key Points
An address mapping table and a hash index are often used for implementing Deduplication. The address mapping table converts a logical address to a physical location for each block, and the hash index converts a hash value to a physical location for each block. When a block X is to be written to the data storage, a hash value is calculated from the content of X, and then the hash index is searched. If the same hash value is not found in the hash index, a new block is allocated in the storage space, X is...
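The write path described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the design of any particular product: the class name DedupStore, the use of SHA-256 as the hash function, and the list-based "physical storage" are all assumptions made for the example. The branch taken when the hash value is already present (pointing the logical address at the existing block) is also an assumption, as is trusting the hash without a byte-by-byte comparison; overwrite handling and space reclamation are omitted.

```python
import hashlib


class DedupStore:
    """Toy block store with deduplication (hypothetical, for illustration)."""

    def __init__(self):
        self.blocks = []       # physical storage; list index = physical location
        self.address_map = {}  # address mapping table: logical address -> physical location
        self.hash_index = {}   # hash index: hash value -> physical location

    def write(self, logical_addr, data):
        """Write block `data` (bytes) at `logical_addr`; return its physical location."""
        digest = hashlib.sha256(data).digest()  # hash computed from the block content
        location = self.hash_index.get(digest)  # search the hash index
        if location is None:
            # Hash value not found: allocate a new block, store the content,
            # and register the new block in the hash index.
            location = len(self.blocks)
            self.blocks.append(data)
            self.hash_index[digest] = location
        # Assumed behavior when the hash is found: no new block is stored and
        # the logical address simply refers to the existing physical block.
        self.address_map[logical_addr] = location
        return location

    def read(self, logical_addr):
        """Return the block content stored at `logical_addr`."""
        return self.blocks[self.address_map[logical_addr]]


store = DedupStore()
store.write(0, b"hello")
store.write(1, b"world")
store.write(2, b"hello")  # same content as logical address 0
```

In this sketch, three logical blocks are written but only two physical blocks are stored, because logical addresses 0 and 2 map to the same physical location.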
Recommended Reading
Diligent Technologies. HyperFactor: a breakthrough in data reduction technology. White Paper. 2008.
Patterson H. Dedupe-centric storage for general applications. White Paper, Data Domain. 2008.
Quinlan S, Dorward S. Venti: a new approach to archival storage. In: Proceedings of the 1st USENIX Conference on File and Storage Technologies; 2002. p. 89–102.
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Goda, K. (2018). Deduplication. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_1323
DOI: https://doi.org/10.1007/978-1-4614-8265-9_1323
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering