Deterministic Extraction of Compact Sets of Rules for Subgroup Discovery

  • Juan L. Domínguez-Olmedo
  • Jacinto Mata Vázquez
  • Victoria Pachón
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9375)


This work presents a novel deterministic method for obtaining rules in Subgroup Discovery tasks. It does not discretize the numeric attributes beforehand; instead, the conditions on those attributes are obtained dynamically. The AUC value of each rule is used to select the final rules. An experimental study, supported by appropriate statistical tests, showed good results in comparison with the classic deterministic algorithms CN2-SD and APRIORI-SD. The best results were obtained in the number of induced rules, where a significant reduction was achieved. In addition, better coverage and fewer attributes were obtained in the comparison with CN2-SD.
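As a rough illustration of scoring a rule by AUC, the sketch below evaluates one rule whose condition is an interval on a numeric attribute, with no prior discretization. It is not the authors' algorithm: the abstract does not give their AUC definition, so the code assumes the common single-point ROC form, AUC = (1 + TPR − FPR) / 2. The names `IntervalRule` and `rule_auc` and the toy data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IntervalRule:
    """Hypothetical rule: lower <= example[attribute] <= upper -> target class."""
    attribute: str
    lower: float
    upper: float
    target: str

    def covers(self, example: dict) -> bool:
        # The condition is a raw numeric interval, not a pre-computed bin.
        return self.lower <= example[self.attribute] <= self.upper


def rule_auc(rule: IntervalRule, examples: list, label_key: str = "class") -> float:
    """AUC of a single rule, ASSUMING the single-point ROC form (1 + TPR - FPR) / 2."""
    tp = fp = pos = neg = 0
    for ex in examples:
        covered = rule.covers(ex)
        if ex[label_key] == rule.target:
            pos += 1
            tp += covered  # bool counts as 0 or 1
        else:
            neg += 1
            fp += covered
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")
    tpr = tp / pos  # true positive rate (sensitivity)
    fpr = fp / neg  # false positive rate
    return (1 + tpr - fpr) / 2


# Toy data, invented for illustration only.
examples = [
    {"age": 25, "class": "yes"},
    {"age": 30, "class": "yes"},
    {"age": 35, "class": "no"},
    {"age": 40, "class": "yes"},
    {"age": 50, "class": "no"},
    {"age": 60, "class": "no"},
]
rule = IntervalRule(attribute="age", lower=20, upper=40, target="yes")
score = rule_auc(rule, examples)
```

On this toy data the rule covers all three positive examples and one of the three negatives, so TPR = 1 and FPR = 1/3. Under the assumed formula, a selection step could rank candidate rules by this score and keep the highest-scoring ones.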


Data mining, Machine learning, Rule-based systems



This work was partially funded by the Regional Government of Andalusia (Junta de Andalucía), grant number TIC-7629.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Juan L. Domínguez-Olmedo (1)
  • Jacinto Mata Vázquez (1)
  • Victoria Pachón (1)
  1. Escuela Técnica Superior de Ingeniería, University of Huelva, Huelva, Spain
