Predictive Sequence Miner in ILP Learning

  • Carlos Abreu Ferreira
  • João Gama
  • Vítor Santos Costa
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7207)

Abstract

This work presents an optimized version of XMuSer, an ILP-based framework for exploring temporal patterns in multi-relational databases. XMuSer's main idea is to exploit frequent sequence mining, an efficient method for learning temporal patterns in the form of sequences. The framework's efficiency rests on a new coding methodology for temporal data and on the use of a predictive sequence miner. The framework selects the most interesting sequential patterns and maps them into a new table, the sequence relation. In the last step, an ILP algorithm learns a classification theory on the enlarged relational database, which consists of the original multi-relational database and the new sequence relation.

We evaluate our framework on three classification problems and, for each one, map each of three types of sequential patterns: frequent, closed, and maximal. The experiments show that our ILP-based framework benefits from both the descriptive power of ILP algorithms and the efficiency of sequential miners.
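
To make the distinction between frequent, closed, and maximal sequential patterns concrete, the following is a minimal Python sketch. It is not XMuSer's predictive sequence miner or its coding methodology: it uses a brute-force enumeration over a toy database, and every name in it (`db`, `frequent_patterns`, `closed_patterns`, `maximal_patterns`, `sequence_relation`) is a hypothetical choice made for illustration. The last function only mimics the idea of mapping selected patterns into a new table keyed by example.

```python
from itertools import combinations

def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)

def frequent_patterns(db, min_support):
    """Brute-force miner: map each frequent pattern to its support count."""
    candidates = set()
    for seq in db.values():
        for n in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), n):
                candidates.add(tuple(seq[i] for i in idx))
    support = {p: sum(is_subsequence(p, s) for s in db.values())
               for p in candidates}
    return {p: c for p, c in support.items() if c >= min_support}

def closed_patterns(freq):
    """Frequent patterns with no proper super-pattern of equal support."""
    return {p: c for p, c in freq.items()
            if not any(len(q) > len(p) and freq[q] == c and is_subsequence(p, q)
                       for q in freq)}

def maximal_patterns(freq):
    """Frequent patterns with no frequent proper super-pattern."""
    return {p: c for p, c in freq.items()
            if not any(len(q) > len(p) and is_subsequence(p, q) for q in freq)}

def sequence_relation(db, patterns):
    """Toy 'sequence relation': one (example_id, pattern) row per match."""
    return sorted((eid, p) for eid, s in db.items()
                  for p in patterns if is_subsequence(p, s))

# Toy sequence database (assumed data): example id -> ordered events.
db = {"e1": ["a", "b", "c"], "e2": ["a", "b"], "e3": ["a", "c"]}

freq = frequent_patterns(db, min_support=2)
closed = closed_patterns(freq)
maximal = maximal_patterns(freq)
rows = sequence_relation(db, maximal)
```

On this toy database with a minimum support of 2, the frequent set contains five patterns, the closed set three, and the maximal set two, which shows how the closed and maximal variants shrink the number of rows that would be added to the relational database before the ILP step.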

Keywords

Sequence Database, Sequential Pattern, Inductive Logic Programming, Frequent Sequence Mining
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Carlos Abreu Ferreira (1)
  • João Gama (2)
  • Vítor Santos Costa (3)

  1. LIAAD-INESC and ISEP, Polytechnic Institute of Porto, Porto, Portugal
  2. LIAAD-INESC and Faculty of Economics, University of Porto, Porto, Portugal
  3. CRACS-INESC and Faculty of Sciences, University of Porto, Porto, Portugal
