Abstract
As tools for personal storage, file synchronization and data sharing, cloud storage services such as Dropbox have quickly gained popularity. These services provide users with ubiquitous, reliable data storage that can be automatically synced across multiple devices, and also shared among a group of users. To minimize the network overhead, cloud storage services employ binary diff, data compression, and other mechanisms when transferring updates among users. However, despite these optimizations, we observe that in the presence of frequent, short updates to user data, the network traffic generated by cloud storage services often exhibits pathological inefficiencies. Through comprehensive measurements and detailed analysis, we demonstrate that many cloud storage applications generate session maintenance traffic that far exceeds the useful update traffic. We refer to this behavior as the traffic overuse problem. To address this problem, we propose the update-batched delayed synchronization (UDS) mechanism. Acting as a middleware between the user’s file storage system and a cloud storage application, UDS batches updates from clients to significantly reduce the overhead caused by session maintenance traffic, while preserving the rapid file synchronization that users expect from cloud storage services. Furthermore, we extend UDS with a backwards compatible Linux kernel modification that further improves the performance of cloud storage applications by reducing the CPU usage.
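The batching idea can be illustrated with a minimal sketch. This is not the authors' UDS implementation: the class name `UpdateBatcher`, the `max_bytes` and `max_delay` thresholds, their default values, and the `flush` callback are all hypothetical, chosen only to show how many short updates might be collapsed into a single sync session while bounding how long a change waits.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class UpdateBatcher:
    """Hypothetical batcher: buffers file updates and releases them together."""
    flush: Callable[[Dict[str, bytes]], None]  # hands one batch to the sync client
    max_bytes: int = 256 * 1024                # assumed size threshold
    max_delay: float = 5.0                     # assumed bound on waiting time (seconds)
    _pending: Dict[str, bytes] = field(default_factory=dict)
    _first_ts: Optional[float] = None

    def on_update(self, path: str, data: bytes, now: float) -> None:
        # Keep only the latest content per path, so intermediate versions
        # never reach the sync client.
        if self._first_ts is None:
            self._first_ts = now
        self._pending[path] = data
        self._maybe_flush(now)

    def tick(self, now: float) -> None:
        # Called periodically so a quiet period still triggers a flush.
        self._maybe_flush(now)

    def _maybe_flush(self, now: float) -> None:
        if not self._pending:
            return
        size = sum(len(d) for d in self._pending.values())
        waited = now - self._first_ts
        if size >= self.max_bytes or waited >= self.max_delay:
            batch, self._pending = self._pending, {}
            self._first_ts = None
            self.flush(batch)


def demo() -> List[Dict[str, bytes]]:
    """Simulate 100 short appends to one file within one second."""
    sessions: List[Dict[str, bytes]] = []
    batcher = UpdateBatcher(flush=sessions.append, max_bytes=1024, max_delay=5.0)
    for i in range(100):
        batcher.on_update("notes.txt", b"x" * (i + 1), now=i * 0.01)
    batcher.tick(now=6.0)  # the delay bound expires, so the batch is released
    return sessions
```

In this sketch the 100 short writes result in one call to `flush` instead of 100, which is the effect the abstract attributes to batching. The time bound is what keeps synchronization reasonably prompt; an explicit clock argument (`now`) is used here only to keep the example deterministic.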
Copyright information
© 2013 IFIP International Federation for Information Processing
About this paper
Cite this paper
Li, Z. et al. (2013). Efficient Batched Synchronization in Dropbox-Like Cloud Storage Services. In: Eyers, D., Schwan, K. (eds) Middleware 2013. Middleware 2013. Lecture Notes in Computer Science, vol 8275. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45065-5_16
DOI: https://doi.org/10.1007/978-3-642-45065-5_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45064-8
Online ISBN: 978-3-642-45065-5
eBook Packages: Computer Science (R0)