Supervised Local Pattern Mining

Ventura, Sebastián; Luna, José María

doi:10.1007/978-3-319-33858-3_7

Abstract

Pattern mining extracts hidden knowledge from data in the form of patterns. Such patterns (subsequences, substructures, or itemsets that capture homogeneity and regularity in data) have traditionally been extracted from unlabeled data. However, many research areas aim to discover patterns in the form of rules induced from labeled data. It is therefore of interest to discover patterns and associations from a supervised point of view, since a single item or a set of items can be considered distinctive. This task, known as supervised local pattern mining, is described in this chapter, covering areas such as contrast set mining, emerging pattern mining, and subgroup discovery. The chapter focuses mainly on subgroup discovery, a widely known task in supervised local pattern mining. It provides a thorough description of this task, including some important quality measures in the field, and then presents different evolutionary approaches for subgroup discovery. Finally, it analyzes other approaches proposed for mining and quantifying local patterns.

Author information

Authors and Affiliations

Department of Computer Science and Numerical Analysis, University of Cordoba, Cordoba, Spain
Sebastián Ventura & José María Luna

About this chapter

Cite this chapter

Ventura, S., Luna, J.M. (2016). Supervised Local Pattern Mining. In: Pattern Mining with Evolutionary Algorithms. Springer, Cham. https://doi.org/10.1007/978-3-319-33858-3_7

DOI: https://doi.org/10.1007/978-3-319-33858-3_7
Published: 14 June 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-33857-6
Online ISBN: 978-3-319-33858-3
eBook Packages: Computer Science, Computer Science (R0)
