Frequent Generalized Subsequences — A Problem From Web Mining

Gaul, Wolfgang; Schmidt-Thieme, Lars

doi:10.1007/978-3-642-58250-9_35

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization

Abstract

The continuing growth of the web in terms of, e.g., amount of information, size of the net, and number of users demands tools that help to tackle content (re)structuring, discover navigation patterns of users, support marketing activities of sellers (e.g., advertising and cross-selling), and attract potential customers in this new environment. This paper describes how user navigation paths can be extracted from raw web logfiles and how frequent subsequences can be discovered in those paths. To better cope with navigational behaviour in the large, generalized sequences containing wildcards are used, and a new algorithm for mining all frequent generalized subsequences from a given database is presented.
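
The two steps named in the abstract can be illustrated with a small Python sketch. It is not the algorithm presented in the chapter; it only shows, under stated assumptions, what "extracting paths" and "support of a generalized sequence" could mean. Assumed here and not taken from the source: log records are already parsed into (user, timestamp, page) tuples, a new path starts after a 30-minute gap, the wildcard is written `*` and stands for any run of zero or more pages, and pattern elements not separated by a wildcard must be adjacent in the path. All function names are hypothetical.

```python
WILDCARD = "*"  # assumed marker: any run of zero or more pages


def paths_from_log(records, timeout=1800):
    """Group (user, timestamp, page) records into navigation paths.

    A new path starts when two requests of the same user are more than
    `timeout` seconds apart (an assumed heuristic, not from the source).
    """
    paths = []
    last_seen = {}  # user -> timestamp of that user's previous request
    current = {}    # user -> index of that user's open path in `paths`
    for user, ts, page in sorted(records, key=lambda r: (r[0], r[1])):
        if user not in current or ts - last_seen[user] > timeout:
            current[user] = len(paths)
            paths.append([])
        paths[current[user]].append(page)
        last_seen[user] = ts
    return paths


def matches(pattern, path):
    """Return True if the generalized sequence `pattern` occurs in `path`."""
    # Split the pattern at wildcards into blocks of adjacent pages.
    blocks, block = [], []
    for item in pattern:
        if item == WILDCARD:
            if block:
                blocks.append(block)
                block = []
        else:
            block.append(item)
    if block:
        blocks.append(block)

    # Place each block at its leftmost possible position after the
    # previous block; leftmost placement never rules out a later match.
    pos = 0
    for block in blocks:
        for start in range(pos, len(path) - len(block) + 1):
            if path[start:start + len(block)] == block:
                pos = start + len(block)
                break
        else:
            return False
    return True


def support(pattern, paths):
    """Fraction of paths that contain `pattern`."""
    return sum(matches(pattern, p) for p in paths) / len(paths)


records = [
    ("u1", 0, "A"), ("u1", 10, "B"), ("u1", 20, "C"),
    ("u1", 5000, "A"), ("u1", 5010, "C"),
    ("u2", 0, "A"), ("u2", 5, "C"),
]
paths = paths_from_log(records)
print(paths)
print(support(["A", "C"], paths))
print(support(["A", WILDCARD, "C"], paths))
```

In this toy log the contiguous sequence A, C occurs in two of the three paths, while the generalized sequence A, *, C occurs in all three, which is the kind of coarser navigation pattern that wildcards are meant to capture. A real miner would enumerate candidate patterns level by level and prune infrequent ones rather than test patterns one at a time.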

Author information

Authors and Affiliations

Institut für Entscheidungstheorie und Unternehmensforschung, Universität Karlsruhe, D-76128, Karlsruhe, Germany
Wolfgang Gaul & Lars Schmidt-Thieme

Editor information

Editors and Affiliations

Institut für Entscheidungstheorie und Unternehmensforschung, Universität Karlsruhe (TH), Kaiserstraße 12, D-76128, Karlsruhe, Germany
Wolfgang Gaul
Lehrstuhl für Mathematische Methoden der Wirtschaftswissenschaften, Universität Augsburg, Universitätsstraße 16, D-86135, Augsburg, Germany
Otto Opitz
Lehrstuhl für Wirtschaftsinformatik III, Universität Mannheim, Schloß, D-68131, Mannheim, Germany
Martin Schader

About this chapter

Cite this chapter

Gaul, W., Schmidt-Thieme, L. (2000). Frequent Generalized Subsequences — A Problem From Web Mining. In: Gaul, W., Opitz, O., Schader, M. (eds) Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-58250-9_35

DOI: https://doi.org/10.1007/978-3-642-58250-9_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67731-4
Online ISBN: 978-3-642-58250-9
eBook Packages: Springer Book Archive
