Analysis of Web Objects Distribution

  • Manuel Gómez Zotano
  • Jorge Gómez Sanz
  • Juan Pavón
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 373)


Understanding how the web objects of a website are requested is relevant to the design and implementation of techniques that ensure good quality of service. Several authors have studied generic profiles for web access and concluded that they resemble a Zipf distribution, but further evidence was missing. This paper contributes additional empirical evidence confirming that a Zipf distribution is present in different domains and that its shape has changed since past studies. More specifically, the α parameter has become greater than one, a consequence of the popularity factor having become more critical than before. The analysis also considers the impact of web technologies on the characterization of web traffic.
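As a rough illustration of why the value of α matters, the sketch below assumes the common Zipf-like formulation in which the object of popularity rank i is requested with probability proportional to 1/i^α. It is not taken from the paper: the function names, the catalogue size, and the sample α values are hypothetical, chosen only to show how the share of requests captured by the most popular objects grows as α passes one.

```python
def zipf_probabilities(n_objects, alpha):
    """Request probabilities for ranks 1..n_objects, assuming P(i) proportional to 1 / i**alpha."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, n_objects + 1)]
    total = sum(weights)  # normalising constant so the probabilities sum to 1
    return [w / total for w in weights]


def top_k_share(n_objects, alpha, k):
    """Fraction of all requests that go to the k most popular objects."""
    probs = zipf_probabilities(n_objects, alpha)
    return sum(probs[:k])


if __name__ == "__main__":
    n = 10_000  # hypothetical catalogue size, not a figure from the paper
    for alpha in (0.7, 1.0, 1.2):  # illustrative values only
        share = top_k_share(n, alpha, k=100)
        print(f"alpha={alpha}: top 100 of {n} objects receive {share:.1%} of requests")
```

Under this assumed model, a larger α concentrates requests on fewer objects, which is one way to read the statement that popularity has become more critical, for example when sizing a cache.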


Keywords: Web traffic analysis, Web technology, Zipf distribution, Cache policy





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Manuel Gómez Zotano (1)
  • Jorge Gómez Sanz (2)
  • Juan Pavón (2)
  1. Corporación Radio Televisión Española, Madrid, Spain
  2. Facultad de Informática, Universidad Complutense de Madrid, Madrid, Spain
