
Finding Generalized Path Patterns for Web Log Data Mining

  • Alex Nanopoulos
  • Yannis Manolopoulos
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1884)

Abstract

Mining web server logs involves determining frequently occurring access sequences. We examine the problem of finding traversal patterns in web logs, taking into account that accesses to irrelevant web documents, made for navigational purposes, may be interleaved within access patterns. We define a general type of pattern that accounts for this fact, and we present a level-wise algorithm for determining these patterns that is based on the underlying structure of the web site. The performance of the algorithm and its sensitivity to several parameters are examined experimentally with synthetic data.
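The idea of patterns that tolerate interleaved navigational accesses, found level by level along the site's link structure, can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify; it is a generic Apriori-style search under stated assumptions: a pattern is supported by a session if its pages appear in order with gaps allowed, and a candidate is extended only along a hyperlink of the site. The names `contains_in_order`, `support`, `mine_paths`, and the toy `links` and `sessions` data are all hypothetical.

```python
def contains_in_order(session, pattern):
    """True if the pattern's pages occur in the session in order, gaps allowed."""
    it = iter(session)
    # `page in it` consumes the iterator up to the first match, which
    # enforces the ordering while skipping interleaved accesses.
    return all(page in it for page in pattern)


def support(sessions, pattern):
    """Number of sessions that contain the pattern."""
    return sum(contains_in_order(s, pattern) for s in sessions)


def mine_paths(sessions, links, min_support):
    """Level-wise search: grow frequent patterns one site link at a time."""
    pages = sorted({p for s in sessions for p in s})
    level = {}
    for p in pages:
        n = support(sessions, (p,))
        if n >= min_support:
            level[(p,)] = n
    frequent = {}
    while level:
        frequent.update(level)
        next_level = {}
        for pat in level:
            # Assumption: candidates follow the site graph (adjacency lists).
            for nxt in sorted(links.get(pat[-1], ())):
                if nxt in pat:
                    continue  # keep candidates acyclic so the loop terminates
                cand = pat + (nxt,)
                n = support(sessions, cand)
                if n >= min_support:
                    next_level[cand] = n
        level = next_level
    return frequent


# Hypothetical site structure and user sessions.
links = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
sessions = [
    ["A", "B", "X", "D"],
    ["A", "C", "B", "D"],
    ["A", "B", "D"],
    ["C", "D"],
]
frequent = mine_paths(sessions, links, min_support=3)
for pattern, count in frequent.items():
    print(pattern, count)
```

In this toy data the path A, B, D is supported by three sessions, although only one session visits those pages consecutively; a strictly contiguous definition would miss the other two.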

Keywords

Association Rule Mining, Association Rule, Adjacency List, Corruption Level, Candidate Path
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Alex Nanopoulos (1)
  • Yannis Manolopoulos (1)
  1. Data Engineering Lab, Department of Informatics, Aristotle University, Thessaloniki, Greece
