Subgroup Discovery Using Bump Hunting on Multi-relational Histograms

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7207)

Abstract

We propose an approach to subgroup discovery in relational databases containing numerical attributes. The approach detects bumps in histograms constructed from the substitution sets obtained by matching a first-order query against the input relational database. We evaluate it on seven data sets, where it discovers interpretable subgroups. The rate at which subgroups survive from the training split to the testing split varies across the data sets, but on at least three of them it is very high.
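As a rough illustration of the idea in the abstract, the sketch below collects the numeric values bound by a query over a toy relation, bins them into a histogram, and reports bins that stand out as local maxima. It is not the authors' algorithm: the relation `transaction`, the helper names `substitution_values`, `histogram`, and `find_bumps`, the equal-width binning, and the local-maximum criterion are all assumptions made for illustration.

```python
# Toy relation transaction(Account, Amount); the facts are made up.
transaction = [
    ("a1", 1.0), ("a1", 1.2), ("a2", 1.1), ("a2", 0.9),
    ("a3", 5.0), ("a3", 5.1), ("a4", 4.9), ("a4", 5.2),
    ("a5", 5.0), ("a5", 3.0),
]


def substitution_values(facts):
    """Values bound to X by the query transaction(A, X), one per substitution."""
    return [amount for (_account, amount) in facts]


def histogram(values, lo, hi, bins):
    """Count values into `bins` equal-width bins covering [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if lo <= v <= hi:
            # The right edge hi falls into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
    return counts


def find_bumps(counts, min_count=1):
    """Indices of bins that are strict local maxima with at least min_count items.

    Bins outside the histogram are treated as empty; flat plateaus are
    not reported because the comparison is strict.
    """
    bumps = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i < len(counts) - 1 else 0
        if c >= min_count and c > left and c > right:
            bumps.append(i)
    return bumps


values = substitution_values(transaction)
counts = histogram(values, lo=0.0, hi=6.0, bins=6)
bumps = find_bumps(counts, min_count=2)
print(counts, bumps)
```

In this toy setting each reported bin index corresponds to an interval of the numeric attribute, and the substitutions falling into that interval would form a candidate subgroup. A real system would need a principled bump criterion and a search over queries, which this sketch leaves out.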

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Černoch, R., Železný, F. (2012). Subgroup Discovery Using Bump Hunting on Multi-relational Histograms. In: Muggleton, S.H., Tamaddoni-Nezhad, A., Lisi, F.A. (eds) Inductive Logic Programming. ILP 2011. Lecture Notes in Computer Science, vol 7207. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31951-8_11

  • DOI: https://doi.org/10.1007/978-3-642-31951-8_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31950-1

  • Online ISBN: 978-3-642-31951-8

  • eBook Packages: Computer Science (R0)
