Web Site Access Analysis for a National Statistical Agency

Data Mining and Decision Support

Abstract

Web access log analysis is gaining popularity, especially with the growing number of commercial Web sites selling their products. Interest is driven by the promise of insight into the behaviour of users and customers as they browse a Web site, and by the desire to improve the user experience. In this chapter we describe the approach taken in analysing the Web access logs of a non-commercial Web site that disseminates Portuguese statistical data. In developing the approach, we follow the common steps for data mining applications (the CRISP-DM phases) and give details about several of the phases involved in developing the data mining solution. Through intensive communication with the Web site owner, we identified three data mining problems, which were successfully addressed using different tools and methods. The solution methodology is briefly described here, accompanied by some of the results for illustrative purposes. We conclude with an attempt to generalize our experience and provide a number of lessons learned.

Copyright information

© 2003 Springer Science+Business Media New York

About this chapter

Cite this chapter

Jorge, A., Alves, M.A., Grobelnik, M., Mladenić, D., Petrak, J. (2003). Web Site Access Analysis for a National Statistical Agency. In: Mladenić, D., Lavrač, N., Bohanec, M., Moyle, S. (eds) Data Mining and Decision Support. The Springer International Series in Engineering and Computer Science, vol 745. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0286-9_14

  • DOI: https://doi.org/10.1007/978-1-4615-0286-9_14

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-5004-0

  • Online ISBN: 978-1-4615-0286-9

  • eBook Packages: Springer Book Archive
