Supervised Local Pattern Mining

  • Sebastián Ventura
  • José María Luna
Chapter

Abstract

Pattern mining is a widely used task for extracting hidden knowledge in the form of patterns. Such patterns (subsequences, substructures, or itemsets that capture some homogeneity or regularity in data) have traditionally been extracted from unlabeled data. However, many research areas aim to discover patterns in the form of rules induced from labeled data. It is therefore useful to discover patterns and associations from a supervised point of view, since a single item or a set of items can be considered distinctive. This task, known as supervised local pattern mining, is described in this chapter, covering areas such as contrast set mining, emerging pattern mining, and subgroup discovery. The chapter focuses mainly on subgroup discovery, a widely known task in the field of supervised local pattern mining. It provides a detailed description of this task, including some important quality measures in the field, and then presents different evolutionary approaches for subgroup discovery. Finally, it analyzes other approaches proposed for mining and quantifying local patterns.
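
As a rough illustration of what a subgroup discovery quality measure looks like, the sketch below computes weighted relative accuracy (WRAcc), a measure commonly used in the subgroup discovery literature. The abstract does not state which measures the chapter covers, so the choice of WRAcc is an assumption, and the function name `wracc`, the record layout, and the toy data are all hypothetical.

```python
from typing import Callable, Mapping, Sequence

Record = Mapping[str, object]


def wracc(
    records: Sequence[Record],
    condition: Callable[[Record], bool],
    target: Callable[[Record], bool],
) -> float:
    """Weighted relative accuracy of the rule `condition -> target`.

    WRAcc = coverage * (precision - base_rate), where
      coverage  = fraction of records satisfying the condition,
      precision = fraction of covered records that satisfy the target,
      base_rate = fraction of all records that satisfy the target.
    """
    n = len(records)
    if n == 0:
        return 0.0
    covered = [r for r in records if condition(r)]
    if not covered:
        # A subgroup that covers nothing carries no information.
        return 0.0
    coverage = len(covered) / n
    precision = sum(1 for r in covered if target(r)) / len(covered)
    base_rate = sum(1 for r in records if target(r)) / n
    return coverage * (precision - base_rate)


# Hypothetical labeled data: "disease" plays the role of the target variable.
data = [
    {"smoker": True, "age": "old", "disease": True},
    {"smoker": True, "age": "young", "disease": True},
    {"smoker": True, "age": "old", "disease": False},
    {"smoker": False, "age": "old", "disease": False},
    {"smoker": False, "age": "young", "disease": False},
    {"smoker": False, "age": "young", "disease": True},
    {"smoker": False, "age": "old", "disease": False},
    {"smoker": False, "age": "young", "disease": False},
]

# Quality of the subgroup "smoker = True" with respect to "disease = True".
score = wracc(data, lambda r: bool(r["smoker"]), lambda r: bool(r["disease"]))
print(f"WRAcc(smoker -> disease) = {score:.6f}")
```

A positive score means the target is more frequent inside the subgroup than in the data as a whole, weighted by how many records the subgroup covers; a score near zero means the subgroup is either tiny or no different from the overall population.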

Keywords

Quality Measure; Association Rule; Pattern Mining; Target Variable; Association Rule Mining
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Sebastián Ventura (1)
  • José María Luna (1)
  1. Department of Computer Science and Numerical Analysis, University of Cordoba, Cordoba, Spain