Abstract
Deduplication for cloud storage generates a large number of unique fingerprints, and this fingerprint set is often managed with Bloom filters. Bloom filters are usually implemented in Random Access Memory (RAM), which is limited and expensive. A RAM-constrained filter must be kept small, which raises the false positive probability and in turn lowers deduplication efficiency. This paper proposes a Flash Assisted Segmented Bloom Filter for Deduplication (FASBF), which places its Bloom filter (BF) on a solid state drive: the full filter resides on the solid state drive, while only part of it is kept in RAM. This improves duplicate lookup in three ways. First, the Bloom filter can be made sufficiently large. Second, more hash functions can be used. Third, more RAM is left for the fingerprint cache. The approach is evaluated using a prototype implemented on a study-oriented network backup system. The results show that it saves a considerable amount of memory while sustaining a backup throughput of 100 MB/s.
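The link between filter size, hash count, and false positives can be illustrated with the standard Bloom filter approximation p ≈ (1 − e^(−kn/m))^k, where m is the number of bits, n the number of inserted fingerprints, and k the number of hash functions. The following is a minimal sketch of that textbook formula, not the authors' implementation; the fingerprint count and the two bits-per-fingerprint budgets are hypothetical values chosen only for illustration.

```python
import math


def false_positive_rate(m_bits: int, n_items: int, k_hashes: int) -> float:
    """Standard approximation p = (1 - e^(-k*n/m))^k for a Bloom filter."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


def optimal_hash_count(m_bits: int, n_items: int) -> int:
    """k = (m/n) * ln 2 minimises p; rounded to an integer, at least 1."""
    return max(1, round(m_bits / n_items * math.log(2)))


if __name__ == "__main__":
    n = 100_000_000  # hypothetical number of stored fingerprints
    budgets = [
        ("small filter, 8 bits per fingerprint", 8 * n),    # assumed RAM-only budget
        ("large filter, 32 bits per fingerprint", 32 * n),  # assumed flash-backed budget
    ]
    for label, m in budgets:
        k = optimal_hash_count(m, n)
        p = false_positive_rate(m, n, k)
        print(f"{label}: k={k}, p={p:.2e}")
```

Under these assumed budgets, the larger filter admits more hash functions and yields a false positive probability several orders of magnitude lower, which is the effect the abstract attributes to moving the full filter to flash.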
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Dagnaw, G., Teferi, A., Berhan, E. (2015). Flash Assisted Segmented Bloom Filter for Deduplication. In: Abraham, A., Krömer, P., Snasel, V. (eds) Afro-European Conference for Industrial Advancement. Advances in Intelligent Systems and Computing, vol 334. Springer, Cham. https://doi.org/10.1007/978-3-319-13572-4_7
DOI: https://doi.org/10.1007/978-3-319-13572-4_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13571-7
Online ISBN: 978-3-319-13572-4
eBook Packages: Engineering, Engineering (R0)