Abstract
Finding functions whose accuracy changes significantly between two classes is an interesting problem. In this paper, such functions are defined as class contrast functions. Because Gene Expression Programming (GEP) can discover essential relations in data and express them mathematically, it is a natural fit for mining class contrast functions from data. The main contributions of this paper are: (1) proposing a new data mining task, class contrast function mining; (2) designing a GEP-based method to find class contrast functions; (3) presenting several strategies for finding multiple class contrast functions in data; and (4) giving an extensive performance study on both synthetic and real-world datasets. The experimental results show that the proposed methods are effective, and several class contrast functions are discovered from the real-world datasets. Directions for future work on class contrast function mining are discussed based on the experimental results.
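To make the idea of a class contrast function concrete, the following minimal Python sketch scores one candidate function on two classes and reports the gap between its accuracies. It is an illustration only, not the paper's method. The names `accuracy`, `contrast`, and `candidate`, the tolerance-based accuracy measure, and the toy data are all assumptions, since the abstract does not give the formal definitions. No GEP search is performed; the candidate is written by hand.

```python
from typing import Callable, List, Sequence, Tuple

# One record: (input attribute values, target attribute value).
Row = Tuple[Sequence[float], float]


def accuracy(f: Callable[[Sequence[float]], float],
             rows: List[Row], tol: float = 1e-6) -> float:
    """Fraction of rows where f reproduces the target within tol.

    This tolerance-based measure is an assumption made for illustration.
    """
    if not rows:
        return 0.0
    hits = sum(1 for xs, y in rows if abs(f(xs) - y) <= tol)
    return hits / len(rows)


def contrast(f: Callable[[Sequence[float]], float],
             class_a: List[Row], class_b: List[Row],
             tol: float = 1e-6) -> float:
    """Absolute difference between f's accuracy on the two classes."""
    return abs(accuracy(f, class_a, tol) - accuracy(f, class_b, tol))


# Hypothetical toy data: in class A the target is x1 + x2,
# in class B the target is x1 * x2.
class_a = [((1.0, 2.0), 3.0), ((2.0, 5.0), 7.0),
           ((0.5, 0.5), 1.0), ((4.0, 1.0), 5.0)]
class_b = [((1.0, 2.0), 2.0), ((2.0, 5.0), 10.0),
           ((2.0, 2.0), 4.0), ((4.0, 1.0), 4.0)]


def candidate(xs: Sequence[float]) -> float:
    """A hand-written candidate; a GEP run would evolve such expressions."""
    return xs[0] + xs[1]


if __name__ == "__main__":
    print("accuracy on A:", accuracy(candidate, class_a))
    print("accuracy on B:", accuracy(candidate, class_b))
    print("contrast:", contrast(candidate, class_a, class_b))
```

In this toy setting the candidate fits every record of class A but only one record of class B, so its accuracy differs sharply between the classes. A search procedure such as GEP could use a score of this kind as part of its fitness function, though the exact fitness used in the paper is not stated in the abstract.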
This work was supported by the National Natural Science Foundation of China under grant No. 60773169, and by the 11th Five Years Key Programs for Sci. & Tech. Development of China under grant No. 2006BAI05A01.
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Duan, L., Tang, C., Tang, L., Zhang, T., Zuo, J. (2009). Mining Class Contrast Functions by Gene Expression Programming. In: Huang, R., Yang, Q., Pei, J., Gama, J., Meng, X., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2009. Lecture Notes in Computer Science, vol 5678. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03348-3_14
DOI: https://doi.org/10.1007/978-3-642-03348-3_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03347-6
Online ISBN: 978-3-642-03348-3
eBook Packages: Computer Science (R0)