Discovering Patterns from Large and Dynamic Sequential Data

Wang, Ke

doi:10.1023/A:1008689103430

Published: July 1997

Volume 9, pages 33–56 (1997)
Journal of Intelligent Information Systems

Ke Wang¹

Abstract

Most daily and scientific data are sequential in nature. Discovering important patterns from such data can benefit the user and scientist by predicting coming activities, interpreting recurring phenomena, extracting outstanding similarities and differences for close attention, compressing data, and detecting intrusion. We consider the following incremental discovery problem for large and dynamic sequential data. Suppose that patterns were previously discovered and materialized. An update is made to the sequential database. An incremental discovery will take advantage of discovered patterns and compute only the change by accessing the affected part of the database and data structures. In addition to patterns, the statistics and position information of patterns need to be updated to allow further analysis and processing on patterns. We present an efficient algorithm for the incremental discovery problem. The algorithm is applied to sequential data that honors several sequential patterns modeling weather changes in Singapore. The algorithm finds what it is supposed to find. Experiments show that for small updates and large databases, the incremental discovery algorithm runs in time independent of the data size.
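The incremental idea in the abstract can be illustrated with a small sketch. This is not the article's algorithm, which the abstract does not spell out; it is a simplified stand-in that assumes patterns are contiguous substrings of bounded length and that updates are appends only. The class name `IncrementalPatternIndex` and the parameters `max_len` and `min_count` are hypothetical, introduced only for illustration. On each append, only windows ending inside the newly added symbols are visited, so the work depends on the update size and `max_len`, not on the amount of data already stored.

```python
from collections import defaultdict


class IncrementalPatternIndex:
    """Illustrative sketch (assumption, not the article's method).

    Materializes every contiguous pattern of length 1..max_len together
    with its start positions, and refreshes them when symbols are appended.
    """

    def __init__(self, max_len, min_count):
        self.max_len = max_len        # longest pattern length tracked
        self.min_count = min_count    # occurrences needed to count as frequent
        self.data = []                # the sequential database
        self.positions = defaultdict(list)  # pattern -> list of start positions

    def append(self, symbols):
        """Append symbols and update only the affected patterns.

        Returns the set of patterns whose statistics changed.
        """
        old_len = len(self.data)
        self.data.extend(symbols)
        changed = set()
        # A window is new exactly when its last index lies in the appended part.
        for end in range(old_len, len(self.data)):
            for length in range(1, self.max_len + 1):
                start = end - length + 1
                if start < 0:
                    break
                pattern = tuple(self.data[start:end + 1])
                self.positions[pattern].append(start)
                changed.add(pattern)
        return changed

    def frequent(self):
        """Return patterns occurring at least min_count times, with positions."""
        return {
            pattern: starts
            for pattern, starts in self.positions.items()
            if len(starts) >= self.min_count
        }


if __name__ == "__main__":
    index = IncrementalPatternIndex(max_len=2, min_count=2)
    index.append("abab")
    print(sorted(index.frequent()))
    print(sorted(index.append("a")))
    print(index.frequent()[("b", "a")])
```

Keeping the start positions alongside the counts mirrors the abstract's requirement that statistics and position information stay current after each update. Handling deletions or in-place modifications would need more bookkeeping than this sketch provides.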

Author information

Authors and Affiliations

Department of Information Systems and Computer Science, National University of Singapore, Lower Kent Ridge Road, Singapore, 119260
Ke Wang

About this article

Cite this article

Wang, K. Discovering Patterns from Large and Dynamic Sequential Data. Journal of Intelligent Information Systems 9, 33–56 (1997). https://doi.org/10.1023/A:1008689103430

Issue Date: July 1997
DOI: https://doi.org/10.1023/A:1008689103430
