Abstract
In recent years, web usage mining techniques have helped online service providers enhance their services and restructure and redesign their websites in line with the insights gained. These techniques are essential for building intelligent, personalised online services. More recently, it has been recognised that the shift from traditional to online services, with the accompanying growth in the number of online customers and in the traffic they generate, brings new challenges to the field. Highly demanding real-world E-commerce and E-services applications, where rapid, large-volume, and possibly changing data streams do not allow offline processing, motivate the development of new, highly efficient real-time web usage mining techniques. This chapter introduces online web usage mining and presents an overview of the latest developments. It also outlines the major, and still mostly unsolved, challenges in the field.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Hofgesang, P.I. (2009). Online Mining of Web Usage Data: An Overview. In: Ting, IH., Wu, HJ. (eds) Web Mining Applications in E-commerce and E-services. Studies in Computational Intelligence, vol 172. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88081-3_1
DOI: https://doi.org/10.1007/978-3-540-88081-3_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88080-6
Online ISBN: 978-3-540-88081-3
eBook Packages: Engineering, Engineering (R0)