Abstract
Web access log analysis is gaining popularity, especially with the growing number of commercial Web sites selling their products. This interest is driven by the promise of insight into how users and customers behave when browsing a Web site, and by the desire to improve the user experience. In this chapter we describe the approach taken in analysing the Web access logs of a non-commercial Web site that disseminates Portuguese statistical data. In developing the approach, we follow the common steps for data mining applications (the CRISP-DM phases) and give details about several of the phases involved in developing the data mining solution. Through intensive communication with the Web site owner, we identified three data mining problems, which were successfully addressed using different tools and methods. The solution methodology is briefly described here, accompanied by some of the results for illustrative purposes. We conclude with an attempt to generalize our experience and provide a number of lessons learned.
Copyright information
© 2003 Springer Science+Business Media New York
About this chapter
Cite this chapter
Jorge, A., Alves, M.A., Grobelnik, M., Mladenić, D., Petrak, J. (2003). Web Site Access Analysis for a National Statistical Agency. In: Mladenić, D., Lavrač, N., Bohanec, M., Moyle, S. (eds) Data Mining and Decision Support. The Springer International Series in Engineering and Computer Science, vol 745. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0286-9_14
DOI: https://doi.org/10.1007/978-1-4615-0286-9_14
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-5004-0
Online ISBN: 978-1-4615-0286-9
eBook Packages: Springer Book Archive