Web Site Access Analysis for a National Statistical Agency

Jorge, Alípio; Alves, Mário A.; Grobelnik, Marko; Mladenić, Dunja; Petrak, Johann

doi:10.1007/978-1-4615-0286-9_14

Part of the book series: The Springer International Series in Engineering and Computer Science (SECS, volume 745)

Abstract

Web access log analysis is gaining popularity, especially with the growing number of commercial Web sites selling products. The driver for this interest is the promise of insight into how users or customers behave when browsing a Web site, fuelled by the desire to improve the user experience. In this chapter we describe the approach taken in analysing the Web access logs of a non-commercial Web site that disseminates Portuguese statistical data. In developing the approach, we follow the common steps for data mining applications (the CRISP-DM phases) and give details of several of the phases involved in developing the data mining solution. Through intensive communication with the Web site owner, we identified three data mining problems, which were successfully addressed using different tools and methods. The solution methodology is briefly described here, accompanied by some of the results for illustrative purposes. We conclude with an attempt to generalize our experience and provide a number of lessons learned.

References

Breese, J. S., Heckerman, D. and Kadie, C. (1998). Empirical analysis of predictive algorithms for collaborative filtering. Proc. Fourteenth Annual Conference on Uncertainty in Artificial Intelligence. 43–52.
Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C. and Wirth, R. (2000). CRISP-DM 1.0: Step-by-step data mining guide, CRISP-DM consortium, http://www.crisp-dm.org
Grobelnik, M. and Mladenić, D. (2002a). Efficient visualization of large text corpora. Proc. 7th TELRI seminar. Dubrovnik, Croatia.
Grobelnik, M. and Mladenić, D. (2002b). Visualization and collaborative filtering for web mining tasks. Proc. Information Society IS-2002: Data Mining and Warehouses, (eds. Mladenić, D. and Grobelnik, M.), Ljubljana, Slovenia.
INE (2002). Instituto Nacional de Estatística, Nova versão para o Infoline 2002, (in Portuguese), INEWS, Vol. 5.
Jorge, A., Alves, M. A. and Azevedo, P. (2002). Recommendation with Association Rules: A web mining application. Proc. Information Society IS-2002: Data Mining and Warehouses, (eds. Mladenić, D. and Grobelnik, M.), Ljubljana, Slovenia.
Kohavi, R. (2001). The Good, the Bad and the Ugly. Proc. KDD 2001. ACM Press.
Mena, J. (1999). Data Mining Your Website, Digital Press.
Mladenić, D. (1999). Text-learning and related intelligent agents, In IEEE EXPERT, Special Issue on Applications of Intelligent Information Retrieval, July-August, 1999.
Mobasher, B., Cooley, R. and Srivastava, R. (1999). Data Preparation for Mining World Wide Web Browsing Patterns, Journal of Knowledge and Information Systems, Vol. 1, No. 1, http://maya.cs.depaul.edu/~mobasher/pubs-subject.html
Mobasher, B., Dai, H., Luo, T. and Nakagawa, M. (2001). Improving the effectiveness of collaborative filtering on anonymous web usage data. Proc. IJCAI’s Seventh Workshop on Intelligent Techniques for Web Personalization. Seattle, Washington.
Sarwar, B., Karypis, G., Konstan, J. and Riedl, J. (2000). Analysis of recommendation algorithms for e-commerce. Proc. ACM Conference on Electronic Commerce.
Spiliopoulou, M. and Pohle, C. (2001). Data Mining for Measuring and Improving the Success of Web Sites, Data Mining and Knowledge Discovery, Vol. 5, 85–114.
Steinbach, M., Karypis, G. and Kumar, V. (2000). A comparison of document clustering techniques. Proc. KDD Workshop on Text Mining, (eds. Grobelnik, M., Mladenić, D. and Milic-Frayling, N.), Boston, MA, USA, 109–110.

Editor information

Editors and Affiliations

Department of Intelligent Systems, Jožef Stefan Institute, Ljubljana, Slovenia
Dunja Mladenić (senior researcher) & Marko Bohanec (senior researcher)
Jožef Stefan Institute, Ljubljana, Slovenia
Nada Lavrač (Head of the Intelligent Data Analysis and Computational Linguistic Research Group)
Oxford University Computing Laboratory, UK
Steve Moyle (researcher)

About this chapter

Cite this chapter

Jorge, A., Alves, M.A., Grobelnik, M., Mladenić, D., Petrak, J. (2003). Web Site Access Analysis for a National Statistical Agency. In: Mladenić, D., Lavrač, N., Bohanec, M., Moyle, S. (eds) Data Mining and Decision Support. The Springer International Series in Engineering and Computer Science, vol 745. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0286-9_14

DOI: https://doi.org/10.1007/978-1-4615-0286-9_14
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-5004-0
Online ISBN: 978-1-4615-0286-9
eBook Packages: Springer Book Archive
