In this chapter, we summarize the design and implementation of the components of an efficient deduplication framework for data chains from clients through networks to servers. We point out the key points and strengths of the three components: on the server side, the Hybrid Email Deduplication System (HEDS); on the client side, Structure-Aware File and Email Deduplication for Cloud-based Storage Systems (SAFE); and on the network side, Software-Defined Deduplication as a Network and Storage Service (SoftDance). We end this chapter by describing promising future directions.
In this era of data explosion, huge redundancies exist on storage devices and networks. Existing deduplication solutions, such as storage data deduplication and network redundancy elimination, are not as efficient as they could be at optimizing data movement from clients through networks to servers.
We have designed an efficient deduplication framework that optimizes data chains from clients through networks to servers, and we have developed its components: the Hybrid Email Deduplication System (HEDS) on the server side, Structure-Aware File and Email Deduplication for Cloud-based Storage Systems (SAFE) on the client side, and Software-Defined Deduplication as a Network and Storage Service (SoftDance) on the network side. HEDS balances the trade-off between file-level and block-level deduplication for email systems. SAFE exploits structure-based granularity rather than physical chunk granularity, which enables it to perform deduplication as fast as file-level deduplication while providing the same space savings as block-level deduplication with low overhead. SoftDance, an in-network deduplication service, chains storage data deduplication and network redundancy elimination functions using software-defined networking (SDN), and it achieves both storage space savings and network bandwidth savings with low processing time and memory overhead on storage devices and networks. We also presented a mobile deduplication method that focuses on popular files, such as image and video files, on mobile devices.
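The trade-off between file-level and block-level deduplication mentioned above can be illustrated with a toy sketch. This is not the HEDS or SAFE implementation; the function names, the use of SHA-1 as a fingerprint, and the 4-byte block size are assumptions chosen only to keep the example small.

```python
import hashlib


def file_level_dedup(files):
    """Keep one copy per unique whole-file fingerprint (hypothetical helper).

    Returns (bytes stored, number of index lookups).
    """
    store = {}
    lookups = 0
    for data in files:
        lookups += 1  # one fingerprint lookup per file
        store.setdefault(hashlib.sha1(data).hexdigest(), data)
    return sum(len(v) for v in store.values()), lookups


def block_level_dedup(files, block_size=4):
    """Keep one copy per unique fixed-size block (hypothetical helper).

    Returns (bytes stored, number of index lookups).
    """
    store = {}
    lookups = 0
    for data in files:
        for start in range(0, len(data), block_size):
            block = data[start:start + block_size]
            lookups += 1  # one fingerprint lookup per block
            store.setdefault(hashlib.sha1(block).hexdigest(), block)
    return sum(len(v) for v in store.values()), lookups


if __name__ == "__main__":
    # Two identical files plus one that differs only in its last block.
    sample = [b"AAAABBBBCCCC", b"AAAABBBBCCCC", b"AAAABBBBDDDD"]
    print("file-level :", file_level_dedup(sample))
    print("block-level:", block_level_dedup(sample))
```

On this sample, file-level deduplication needs far fewer index lookups but stores the near-duplicate file in full, whereas block-level deduplication stores less data at the cost of one lookup per block. A hybrid or structure-aware scheme aims to get closer to the block-level savings while keeping the lookup count low.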
Further investigations and studies on deduplication are in order, especially in connection with network and system reliability, storage workload and scalability, and efficient accessibility in multi-user cloud environments.