Web Workload Characterization: Ten Years Later

  • Adepele Williams
  • Martin Arlitt
  • Carey Williamson
  • Ken Barker
Chapter
Part of the Web Information Systems Engineering and Internet Technologies Book Series (WISE, volume 2)

Abstract

In 1996, Arlitt and Williamson [Arlitt et al., 1997] conducted a comprehensive workload characterization study of Internet Web servers. By analyzing access logs collected in 1994 and 1995 from 6 Web sites (3 academic, 2 research, and 1 industrial), the authors identified 10 invariants: workload characteristics common to all the sites that are likely to persist over time. In the present work, we revisit that study, repeating many of the same analyses on new data sets collected in 2004. In particular, we study access logs from the same 3 academic sites used in the 1996 paper. Despite a 30-fold increase in overall traffic volume from 1994 to 2004, our main conclusion is that Web server workload characteristics have not changed dramatically over the last 10 years. Although there have been many changes in Web technologies (e.g., new protocols, scripting languages, caching infrastructures), most of the 1996 invariants still hold true today. We postulate that these invariants will continue to hold in the future, because they represent fundamental characteristics of how humans organize, store, and access information on the Web.

Keywords

Web servers, workload characterization

References

  1. Arlitt, M. and Williamson, C. (1997) Internet Web Servers: Workload Characterization and Performance Implications. IEEE/ACM Transactions on Networking, Vol. 5, No. 5, pp. 631–645.
  2. Barford, P., Bestavros, A., Bradley, A. and Crovella, M. (1999) Changes in Web Client Access Patterns: Characteristics and Caching Implications. World Wide Web Journal, Special Issue on Characterization and Performance Evaluation, pp. 15–28.
  3. Cherkasova, L. and Karlsson, M. (2001) Dynamics and Evolution of Web Sites: Analysis, Metrics and Design Issues. Proceedings of the 6th IEEE Symposium on Computers and Communications, Hammamet, Tunisia, pp. 64–71.
  4. Crovella, M. and Taqqu, M. (1999) Estimating the Heavy Tail Index from Scaling Properties. Methodology and Computing in Applied Probability, Vol. 1, No. 1, pp. 55–79.
  5. Harel, N., Vellanki, V., Chervenak, A., Abowd, G. and Ramachandran, U. (1999) Workload of a Media-Enhanced Classroom Server. Proceedings of the 2nd IEEE Workshop on Workload Characterization, Austin, TX.
  6. Hernandez-Campos, F., Jeffay, K. and Donelson-Smith, F. (2003) Tracking the Evolution of Web Traffic: 1995-2003. Proceedings of 11th IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunications Systems (MASCOTS), Orlando, FL, pp. 16–25.
  7. Mahanti, A., Eager, D. and Williamson, C. (2000) Temporal Locality and its Impact on Web Proxy Cache Performance. Performance Evaluation, Special Issue on Internet Performance Modeling, Vol. 42, No. 2/3, pp. 187–203.
  8. Montgomery, D., Runger, G. and Hubele, N. (2001) Engineering Statistics. John Wiley and Sons, New York.
  9. Moore, G. (1965) Cramming More Components onto Integrated Circuits. Electronics, Vol. 38, No. 8, pp. 114–117.
  10. Odlyzko, A. (2003) Internet Traffic Growth: Sources and Implications. Proceedings of SPIE Optical Transmission Systems and Equipment for WDM Networking II, Vol. 5247, pp. 1–15.
  11. Paxson, V. and Floyd, S. (1995) Wide-area Traffic: The Failure of Poisson Modeling. IEEE/ACM Transactions on Networking, Vol. 3, No. 3, pp. 226–244.
  12. Pitkow, J. (1998) Summary of WWW Characterizations. Proceedings of the Seventh International World Wide Web Conference, Brisbane, Australia, pp. 551–558.
  13. Press, L. (2000) The State of the Internet: Growth and Gaps. Proceedings of INET 2000, Japan. Available at http://www.isoc.org/inet2000/cdproceedings/8e/8e_4.htm#s21.
  14. Romeu, J. (2003) Anderson-Darling: A Goodness of Fit Test for Small Samples Assumptions. Selected Topics in Assurance Related Technologies, Vol. 10, No. 5, DoD Reliability Analysis Center. Available at http://rac.alionscience.com/pdf/A_DTest.pdf.
  15. Schaller, B. (1996) The Origin, Nature, and Implications of Moore’s Law. Available at http://mason.gmu.edu/~rschalle/moorelaw.html.
  16. Williamson, C. (2002) On Filter Effects in Web Caching Hierarchies. ACM Transactions on Internet Technology, Vol. 2, No. 1, pp. 47–77.
  17. Zipf, G. (1949) Human Behavior and the Principle of Least Effort. Addison-Wesley Press, Inc., Cambridge, MA.

Copyright information

© Springer Science+Business Media, Inc. 2005

Authors and Affiliations

  • Adepele Williams (1)
  • Martin Arlitt (1)
  • Carey Williamson (1)
  • Ken Barker (1)

  1. Department of Computer Science, University of Calgary, Calgary, Canada
