Abstract
This paper presents a workload characterization study of YouTube, the most popular short-video sharing service of Web 2.0. Using data gathered over a five-month period, we analyzed the characteristics of around 250,000 popular and regular YouTube videos. In particular, we recursively collected the list of related videos for each video clip and analyzed their statistical behavior. Understanding the traffic of YouTube and similar Web 2.0 video sharing sites is crucial for developing synthetic workload generators. Workload simulators are required to evaluate methods that address the high bandwidth usage and scalability problems of Web 2.0 sites such as YouTube. The distribution models, in particular the Zipf-like behavior of popular YouTube video files, suggest that proxy caching of popular YouTube videos can reduce network traffic and increase the scalability of the YouTube Web site. The YouTube workload characteristics provided in this work enabled us to develop a workload generator to evaluate the effectiveness of this approach.
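The link between Zipf-like popularity and proxy caching can be illustrated with a minimal sketch. This is not the workload generator developed in this work. It assumes that the video of popularity rank r is requested with probability proportional to 1/r^alpha and that the proxy uses a simple LRU replacement policy. The function names, the exponent alpha, the catalog size, and the cache sizes are all hypothetical choices made for illustration only.

```python
import random
from collections import OrderedDict
from itertools import accumulate


def zipf_weights(num_videos, alpha):
    """Unnormalized Zipf-like weights: rank r gets weight 1 / r**alpha."""
    return [1.0 / (rank ** alpha) for rank in range(1, num_videos + 1)]


def generate_requests(num_videos, num_requests, alpha, seed=0):
    """Draw a synthetic request stream of video ids (0 = most popular).

    alpha and the catalog size are illustrative assumptions, not values
    measured in the study.
    """
    rng = random.Random(seed)
    cumulative = list(accumulate(zipf_weights(num_videos, alpha)))
    return rng.choices(range(num_videos), cum_weights=cumulative, k=num_requests)


def lru_hit_ratio(requests, cache_size):
    """Replay the requests through an LRU cache holding cache_size videos."""
    cache = OrderedDict()
    hits = 0
    for video_id in requests:
        if video_id in cache:
            hits += 1
            cache.move_to_end(video_id)    # mark as most recently used
        else:
            cache[video_id] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used
    return hits / len(requests) if requests else 0.0


if __name__ == "__main__":
    stream = generate_requests(num_videos=10_000, num_requests=100_000, alpha=1.0)
    for size in (10, 100, 1_000):
        print(f"cache of {size} videos: hit ratio {lru_hit_ratio(stream, size):.3f}")
```

Under these assumptions, a cache holding only a small fraction of the catalog already serves a disproportionate share of the requests, which is the intuition behind caching popular videos at a proxy. The sketch counts requests rather than bytes, so it ignores video size and partial (prefix) caching.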
Acknowledgements
We would like to thank the anonymous reviewers of this paper for their suggestions.
Cite this article
Abhari, A., Soraya, M. Workload generation for YouTube. Multimed Tools Appl 46, 91–118 (2010). https://doi.org/10.1007/s11042-009-0309-5
DOI: https://doi.org/10.1007/s11042-009-0309-5