Inductive Databases and Constraint-Based Data Mining

  • Conference paper
Formal Concept Analysis (ICFCA 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6628)

Abstract

We briefly introduce the notion of an inductive database, explain its relation to constraint-based data mining, and illustrate it on an example. We then discuss constraints and constraint-based data mining in more detail. We further give an overview of recent developments in the area, focusing on those made within the IQ project and presented in a recent volume with the same title as this paper, edited by the author, Bart Goethals and Panče Panov, and published by Springer.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Džeroski, S. (2011). Inductive Databases and Constraint-Based Data Mining. In: Valtchev, P., Jäschke, R. (eds) Formal Concept Analysis. ICFCA 2011. Lecture Notes in Computer Science, vol 6628. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20514-9_1

  • DOI: https://doi.org/10.1007/978-3-642-20514-9_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20513-2

  • Online ISBN: 978-3-642-20514-9

  • eBook Packages: Computer Science, Computer Science (R0)
