Abstract
Discovery of usage patterns from Web data is one of the primary purposes of Web Usage Mining. In this paper, a technique to generate Significant Usage Patterns (SUP) is proposed and used to acquire significant “user preferred navigational trails”. The technique uses pipelined processing phases: sub-abstraction of sessionized Web clickstreams, clustering of the abstracted Web sessions, concept-based abstraction of the clustered sessions, and SUP generation. Using this technique, Web site practitioners can extract valuable customer behavior information. Experiments conducted using Web log data provided by J.C.Penney demonstrate that SUPs of different types of customers are distinguishable and interpretable. This technique is particularly suited for analysis of dynamic websites.
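The four phases named in the abstract can be pictured with a minimal Python sketch. Everything in it is an illustrative assumption rather than the authors' method: the abstract does not specify the abstraction rule, the clustering algorithm, or the significance criterion, so the sketch substitutes URL-prefix truncation, greedy Jaccard clustering, and a simple support threshold over contiguous trails. All function names, parameters, and sample URLs are hypothetical.

```python
from collections import Counter


def abstract_session(session, depth=1):
    """Phase 1 (assumed rule): keep only the first `depth` path segments of each URL."""
    out = []
    for url in session:
        parts = [p for p in url.strip("/").split("/") if p]
        out.append("/".join(parts[:depth]) or "home")
    return out


def jaccard(a, b):
    """Set overlap between two sessions; a stand-in for a real session similarity."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


def cluster_sessions(sessions, threshold=0.5):
    """Phase 2 (assumed): greedy clustering against each cluster's first member."""
    clusters = []
    for s in sessions:
        for c in clusters:
            if jaccard(s, c[0]) >= threshold:
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters


def to_concepts(session, concept_of):
    """Phase 3 (assumed): map each abstracted page to a concept label."""
    return [concept_of.get(page, page) for page in session]


def significant_trails(cluster, length=2, min_support=0.5):
    """Phase 4 (assumed): contiguous trails found in at least `min_support` of the sessions."""
    counts = Counter()
    for s in cluster:
        # Count each distinct trail once per session.
        counts.update({tuple(s[i:i + length]) for i in range(len(s) - length + 1)})
    return {t: n / len(cluster) for t, n in counts.items() if n / len(cluster) >= min_support}


# Hypothetical sessionized clickstreams.
raw = [
    ["/", "/women/dresses", "/cart"],
    ["/", "/women/shoes", "/cart"],
    ["/men/shirts", "/men/pants"],
]
concept_of = {"women": "category", "men": "category", "cart": "purchase"}

clusters = cluster_sessions([abstract_session(s) for s in raw])
patterns = [significant_trails([to_concepts(s, concept_of) for s in c]) for c in clusters]
```

Running the phases per cluster, as in the last line, is what lets patterns from different groups of visitors be compared side by side; a real system would replace each stand-in with the method described in the full paper.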
This work is supported by the National Science Foundation under Grant No. IIS-0208741.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lu, L., Dunham, M., Meng, Y. (2006). Mining Significant Usage Patterns from Clickstream Data. In: Nasraoui, O., Zaïane, O., Spiliopoulou, M., Mobasher, B., Masand, B., Yu, P.S. (eds) Advances in Web Mining and Web Usage Analysis. WebKDD 2005. Lecture Notes in Computer Science, vol 4198. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11891321_1
DOI: https://doi.org/10.1007/11891321_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46346-7
Online ISBN: 978-3-540-46348-1
eBook Packages: Computer Science, Computer Science (R0)