Abstract
Discriminative pattern mining looks for association patterns that occur more frequently in one class than in another, and it has important applications in many areas, including finding biomarkers in biomedical data. However, finding such patterns is challenging because higher-order combinations of variables may show high discrimination even when single variables or lower-order combinations show little or no discrimination. Generating such patterns is therefore important for evaluating discriminative pattern mining algorithms and for better understanding the nature of discriminative patterns. To that end, we describe how such patterns can be defined using mathematical constraints, which are then solved with widely available software that generates solutions for the resulting optimization problem. We present a basic formulation of the problem, obtained from a straightforward translation of the desired pattern characteristics into mathematical constraints, and then show how the pattern generation problem can be reformulated in terms of the selection of rows from a truth table. The truth-table formulation is more efficient and provides deeper insight into the process of creating higher-order patterns. It also makes it easy to define patterns other than those based on the conjunctive logic used by traditional association and discriminant pattern analysis.
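To make the idea of selecting truth-table rows concrete, the following minimal sketch builds a three-variable pattern whose single variables and pairs show no discrimination while the full conjunction does. It is an illustration only, not the formulation used in the chapter: the parity-based row selection, the equal class sizes, and the names `support`, `support_differences`, `class_a`, and `class_b` are all assumptions introduced here, and no optimization solver is involved.

```python
from itertools import combinations, product


def support(records, items):
    """Fraction of records in which every variable in `items` equals 1."""
    return sum(all(r[i] for i in items) for r in records) / len(records)


def support_differences(a, b, n_vars):
    """Support in class A minus support in class B for every non-empty itemset."""
    diffs = {}
    for k in range(1, n_vars + 1):
        for items in combinations(range(n_vars), k):
            diffs[items] = support(a, items) - support(b, items)
    return diffs


# Truth table over three binary variables (8 rows).
truth_table = list(product([0, 1], repeat=3))

# Assumed row selection for illustration: odd-parity rows form class A,
# even-parity rows form class B. Each class gets one copy of each chosen row.
class_a = [row for row in truth_table if sum(row) % 2 == 1]
class_b = [row for row in truth_table if sum(row) % 2 == 0]

diffs = support_differences(class_a, class_b, n_vars=3)
for items, diff in diffs.items():
    print(items, diff)
```

Under this selection, every single variable has support 0.5 in both classes and every pair has support 0.25 in both classes, so only the three-variable conjunction separates the classes. A constraint-based generator would instead search over row counts subject to user-specified bounds on these differences.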
This work was supported by NSF grant IIS-0916439. Computing resources were provided by the Minnesota Supercomputing Institute.
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Steinbach, M., Yu, H., Fang, G., Kumar, V. (2011). Using Constraints to Generate and Explore Higher Order Discriminative Patterns. In: Huang, J.Z., Cao, L., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2011. Lecture Notes in Computer Science, vol 6634. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20841-6_28
DOI: https://doi.org/10.1007/978-3-642-20841-6_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20840-9
Online ISBN: 978-3-642-20841-6
eBook Packages: Computer Science, Computer Science (R0)