Part of the book series: Advances in Database Systems ((ADBS,volume 27))

Abstract

Web access-logs record the access history of users who visit a Web site. Because log entries are collected automatically, the logs tend to grow very rapidly. Recent work has proposed applying web-log mining methods [4, 7, 12, 15, 16, 17], which search for access-patterns. Examples include methods based on clustering [20] and sequence mining [2].

References

  1. R. Agrawal, R. Srikant. “Mining Sequential Patterns”. Proceedings 11th IEEE International Conference on Data Engineering (ICDE’95), pp.3–14, Taipei, Taiwan, 1995.

  2. M.S. Chen, J.S. Park, P.S. Yu. “Efficient Data Mining for Path Traversal Patterns”. IEEE Transactions on Knowledge and Data Engineering, Vol.10, No.2, pp.209–221, 1998.

  3. S. Christodoulakis, C. Faloutsos. “Signature Files: an Access Method for Documents and its Analytical Performance Evaluation”. ACM Transactions on Office Information Systems, Vol.2, No.4, pp.267–288, 1984.

  4. R. Cooley, B. Mobasher, J. Srivastava. “Data Preparation for Mining World Wide Web Browsing Patterns”. Knowledge and Information Systems, Vol.1, No.1, pp.5–32, 1999.

  5. S. Helmer, G. Moerkotte. “A Study of Four Index Structures for Set-Valued Attributes of Low Cardinality”. Technical Report, Reihe Informatik 2/1999, University of Mannheim, p.20, 1999.

  6. M. Kitagawa, Y. Ishikawa, N. Ohbo. “Evaluation of Signature Files as Set Access Facility in OODBs”. Proceedings ACM International Conference on Management of Data (SIGMOD’93), pp.247–256, Washington, DC, 1993.

  7. M. Garofalakis, R. Rastogi, S. Seshadri, K. Shim. “Data Mining and the Web: Past, Present and Future”. Proceedings 2nd Workshop on Web Information & Data Management (WIDM’99), pp.43–47, Kansas City, Missouri, 1999.

  8. T. Imielinski, A. Virmani. “MSQL: A Query Language for Database Mining”. Data Mining and Knowledge Discovery, Vol.3, No.4, pp.373–408, 1999.

  9. T. Morzy, M. Zakrzewicz. “Group Bitmap Index: a Structure for Association Rules Retrieval”. Proceedings 4th International Conference on Knowledge Discovery in Databases and Data Mining (KDD’98), pp.284–288, New York, NY, 1998.

  10. T. Morzy, M. Wojciechowski, M. Zakrzewicz. “Optimizing Pattern Queries for Web Access Logs”. Proceedings 5th East-European Conference on Advances in Databases and Information Systems (ADBIS’2001), pp.141–154, Vilnius, Lithuania, 2001.

  11. A. Nanopoulos, D. Katsaros, Y. Manolopoulos. “A Data Mining Algorithm for Generalized Web Prefetching”. IEEE Transactions on Knowledge and Data Engineering, in print, 2003.

  12. A. Nanopoulos, Y. Manolopoulos. “Finding Generalized Path Patterns for Web Log Data Mining”. Proceedings 4th East-European Conference on Advances in Databases and Information Systems (ADBIS-DASFAA’2000), pp.215–228, Prague, Czech Republic, 2000.

  13. A. Nanopoulos, M. Zakrzewicz, T. Morzy, Y. Manolopoulos. “Efficient Storage and Querying of Sequential Patterns in Database Systems”. Information and Software Technology, Vol.45, No.1, pp.23–34, 2003.

  14. A. Nanopoulos, M. Zakrzewicz, T. Morzy, Y. Manolopoulos. “Indexing Web Access-Logs for Pattern Queries”. Proceedings 4th International Workshop on Web Information & Data Management (WIDM’2002), pp.398–404, Washington, DC, 2002.

  15. J. Pei, J. Han, B. Mortazavi-Asl, H. Zhu. “Mining Access Patterns Efficiently from Web Logs”. Proceedings 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’2000), Kyoto, Japan, 2000.

  16. M. Perkowitz, O. Etzioni. “Adaptive Web Sites: an AI Challenge”. Proceedings 15th International Joint Conference on Artificial Intelligence (IJCAI’97), Nagoya, Japan, 1997.

  17. J. Pitkow. “In Search of Reliable Usage Data on the WWW”. Proceedings 6th International WWW Conference, Santa Clara, CA, 1997.

  18. M. Spiliopoulou, L. Faulstich. “WUM: a Tool for WWW Utilization Analysis”. Proceedings 1st International Workshop on WWW and Databases (WebDB’98), pp. 184–103, Valencia, Spain, 1998.

  19. E. Tousidou, A. Nanopoulos, Y. Manolopoulos. “Improved Methods for Signature-Tree Construction”. The Computer Journal, Vol.43, No.4, pp.301–314, 2000.

  20. T.W. Yan, M. Jacobsen, H. Garcia-Molina and U. Dayal. “From User Access Patterns to Dynamic Hypertext Linking”, Computer Networks and ISDN Systems, Vol.28, No.7-11, pp.1007–1014, May 1996.

  21. M. Zakrzewicz. “Sequential Index Structure for Content-Based Retrieval”. Proceedings 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’2001), pp.306–311, Hong Kong, China, 2001.

Copyright information

© 2003 Springer Science+Business Media New York

About this chapter

Cite this chapter

Manolopoulos, Y., Nanopoulos, A., Tousidou, E. (2003). Storage and Querying of Large Web-Logs. In: Advanced Signature Indexing for Multimedia and Web Applications. Advances in Database Systems, vol 27. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-8636-8_8

  • DOI: https://doi.org/10.1007/978-1-4419-8636-8_8

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-4654-8

  • Online ISBN: 978-1-4419-8636-8

  • eBook Packages: Springer Book Archive
