Constraint Based Mining of First Order Sequences in SeqLog

  • Chapter

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2682)

Abstract

A logical language, SeqLog, for mining and querying sequential data and databases is presented. In SeqLog, data takes the form of a sequence of logical atoms, background knowledge is specified using Datalog-style clauses, and sequential queries or patterns correspond to subsequences of logical atoms. SeqLog is then used as the representation language for the inductive database mining system MineSeqLog. Inductive queries in MineSeqLog take the form of a conjunction of a monotonic and an anti-monotonic constraint on sequential patterns. Given such an inductive query, MineSeqLog computes the borders of the solution space, using variants of the well-known level-wise algorithm together with ideas from version spaces. Finally, we report on a number of experiments in the domain of user modelling that validate the approach.
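
The interplay of the two constraint types and the borders can be illustrated with a minimal Python sketch. It is not the MineSeqLog implementation and makes strong simplifying assumptions: atoms are ground (plain strings, no variables or background knowledge), a pattern matches a sequence if it occurs as a subsequence with gaps allowed, the anti-monotonic constraint is a minimum frequency on one database, and the monotonic constraint is a maximum frequency on a second database. The function names (`is_subsequence`, `solve`, `borders`) and the toy shell-command data are hypothetical and chosen only for illustration.

```python
from itertools import chain


def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(atom in it for atom in pattern)


def frequency(pattern, database):
    """Number of sequences in database that the pattern matches."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def solve(pos, neg, min_pos, max_neg):
    """Level-wise search for patterns with frequency >= min_pos on pos
    (anti-monotonic) and frequency <= max_neg on neg (monotonic)."""
    if min_pos < 1:
        raise ValueError("min_pos must be >= 1 so the search terminates")
    alphabet = sorted(set(chain.from_iterable(pos)))
    # Level 0 holds the empty pattern, the most general one.
    level = [()] if frequency((), pos) >= min_pos else []
    solutions = []
    while level:
        # Keep the patterns on this level that also meet the monotonic constraint.
        solutions.extend(p for p in level if frequency(p, neg) <= max_neg)
        # Specialise by appending one atom; prune with the anti-monotonic
        # constraint (every prefix of a frequent pattern is frequent).
        level = [p + (a,) for p in level for a in alphabet
                 if frequency(p + (a,), pos) >= min_pos]
    return solutions


def borders(solutions):
    """Return (S, G): the maximally specific and maximally general solutions."""
    s_border = [p for p in solutions
                if not any(p != q and is_subsequence(p, q) for q in solutions)]
    g_border = [p for p in solutions
                if not any(p != q and is_subsequence(q, p) for q in solutions)]
    return s_border, g_border


# Toy data (invented): command sequences of two user groups.
pos = [("ls", "cd", "vi"), ("ls", "vi", "latex"), ("cd", "ls", "vi")]
neg = [("ls", "rm"), ("cd", "ls")]

solutions = solve(pos, neg, min_pos=2, max_neg=0)
s_border, g_border = borders(solutions)
print("S:", sorted(s_border))
print("G:", sorted(g_border))
```

On this toy data the solution set consists of `("vi",)`, `("cd", "vi")` and `("ls", "vi")`, so the specific border S holds the two length-two patterns and the general border G holds `("vi",)`. Every pattern lying between the two borders in the generality order is a solution, which is why reporting the borders alone is sufficient.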

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Dan Lee, S., De Raedt, L. (2004). Constraint Based Mining of First Order Sequences in SeqLog. In: Meo, R., Lanzi, P.L., Klemettinen, M. (eds) Database Support for Data Mining Applications. Lecture Notes in Computer Science, vol 2682. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-44497-8_8

  • DOI: https://doi.org/10.1007/978-3-540-44497-8_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22479-2

  • Online ISBN: 978-3-540-44497-8

  • eBook Packages: Springer Book Archive
