Frequent Generalized Subsequences — A Problem From Web Mining


Abstract

The continuing growth of the web, in terms of, e.g., the amount of information, the size of the net, and the number of users, calls for tools that help to (re)structure content, discover the navigation patterns of users, support the marketing activities of sellers (e.g., advertising and cross-selling), and attract potential customers in this new environment. This paper describes how user navigation paths can be extracted from raw web logfiles and how frequent subsequences can be discovered in those paths. To cope better with navigational behaviour on a large scale, generalized sequences containing wildcards are used, and a new algorithm for mining all frequent generalized subsequences from a given database is presented.
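The abstract does not spell out the algorithm, so the following is only a naive sketch of the two steps it names: turning log records into navigation paths and measuring how often a generalized sequence occurs in them. Everything in it is an assumption made for illustration: the `(user, timestamp, page)` record layout, the 30-minute `timeout` used to split paths, the function names, and the reading of the wildcard `*` as "any run of zero or more pages", with pages not separated by a wildcard required to be adjacent. It checks given patterns one at a time and is not the mining algorithm presented in the chapter.

```python
from collections import defaultdict

WILDCARD = "*"  # assumed notation for "any run of zero or more pages"


def extract_paths(log_entries, timeout=1800):
    """Group (user, timestamp, page) records into navigation paths.

    A new path starts when two consecutive requests of the same user
    are more than `timeout` seconds apart (an assumed heuristic).
    """
    by_user = defaultdict(list)
    for user, timestamp, page in log_entries:
        by_user[user].append((timestamp, page))

    paths = []
    for requests in by_user.values():
        requests.sort()  # order each user's requests by time
        current, last_time = [], None
        for timestamp, page in requests:
            if last_time is not None and timestamp - last_time > timeout:
                paths.append(current)
                current = []
            current.append(page)
            last_time = timestamp
        paths.append(current)
    return paths


def split_blocks(pattern):
    """Split a pattern at wildcards into blocks of adjacent pages."""
    blocks, block = [], []
    for item in pattern:
        if item == WILDCARD:
            if block:
                blocks.append(block)
                block = []
        else:
            block.append(item)
    if block:
        blocks.append(block)
    return blocks


def matches(pattern, path):
    """True if `path` contains all blocks of `pattern` in order.

    Each block must occur contiguously; matching the leftmost
    occurrence of each block leaves the most room for later blocks.
    """
    start = 0
    for block in split_blocks(pattern):
        n = len(block)
        for i in range(start, len(path) - n + 1):
            if path[i:i + n] == block:
                start = i + n
                break
        else:
            return False
    return True


def support(pattern, paths):
    """Fraction of paths that contain the generalized sequence."""
    return sum(matches(pattern, p) for p in paths) / len(paths)


# Hypothetical log records: (user, seconds since start, requested page).
log = [
    ("u1", 0, "a"), ("u1", 10, "b"), ("u1", 20, "c"),
    ("u2", 0, "a"), ("u2", 5, "c"),
    ("u1", 5000, "a"), ("u1", 5010, "c"),
]
paths = extract_paths(log)
```

With this toy log, the pattern `["a", "*", "c"]` is contained in every extracted path, whereas the wildcard-free pattern `["a", "c"]` misses the path in which `b` is visited in between. A pattern would count as frequent when its support reaches a user-chosen minimum threshold.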




© 2000 Springer-Verlag Berlin · Heidelberg

About this chapter

Cite this chapter

Gaul, W., Schmidt-Thieme, L. (2000). Frequent Generalized Subsequences — A Problem From Web Mining. In: Gaul, W., Opitz, O., Schader, M. (eds) Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-58250-9_35

  • DOI: https://doi.org/10.1007/978-3-642-58250-9_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67731-4

  • Online ISBN: 978-3-642-58250-9

  • eBook Packages: Springer Book Archive
