Gene Expression Programming Ensemble for Classifying Big Datasets

Jȩdrzejowicz, Joanna; Jȩdrzejowicz, Piotr

doi:10.1007/978-3-319-67077-5_1

Conference paper
First Online: 07 September 2017

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10449)

Abstract

The paper proposes a new GEP-based batch ensemble classifier constructed using the stacked generalization concept. In our approach, the base classifiers are combined by evolving a meta-gene from genes that GEP induces on randomly generated combinations of instances with randomly selected subsets of attributes. The main property of the discussed classifier is its scalability, which allows it to adapt to the size of the dataset under consideration. To validate the proposed classifier, we have carried out a computational experiment involving a number of publicly available benchmark datasets. The experimental results show that the approach ensures good performance, scalability, and robustness.
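
The two-level structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: decision stumps stand in for GEP-induced genes, and a weighted vote fitted on held-out instances stands in for the evolved meta-gene. All names (`fit_stump`, `fit_ensemble`, `toy_data`) and parameters (`n_base`, `sample_size`, `n_attrs`) are hypothetical, and class labels are assumed to be -1 or +1.

```python
import random


def fit_stump(rows, labels, attrs):
    """Stand-in for a GEP-induced gene: the single-attribute threshold rule
    with the fewest training errors among the given attributes."""
    best = None
    for a in attrs:
        for t in sorted({r[a] for r in rows}):
            for pol in (1, -1):
                errs = sum(1 for r, y in zip(rows, labels)
                           if (pol if r[a] > t else -pol) != y)
                if best is None or errs < best[0]:
                    best = (errs, a, t, pol)
    _, a, t, pol = best
    return lambda r: pol if r[a] > t else -pol


def fit_ensemble(rows, labels, n_base=15, sample_size=30, n_attrs=2, seed=0):
    """Stacking-style ensemble: base learners are trained on random instance
    samples restricted to random attribute subsets; a meta-level weighted
    vote is then fitted on held-out instances (an assumed substitute for
    evolving a meta-gene)."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    half = len(idx) // 2
    level0, level1 = idx[:half], idx[half:]

    base = []
    for _ in range(n_base):
        sample = [rng.choice(level0) for _ in range(sample_size)]
        attrs = rng.sample(range(len(rows[0])), n_attrs)
        base.append(fit_stump([rows[i] for i in sample],
                              [labels[i] for i in sample], attrs))

    # Meta level: weight each base learner by 2 * accuracy - 1 on held-out
    # data, so uninformative learners get weights near zero.
    weights = []
    for h in base:
        acc = sum(h(rows[i]) == labels[i] for i in level1) / len(level1)
        weights.append(2 * acc - 1)

    def predict(r):
        score = sum(w * h(r) for w, h in zip(weights, base))
        return 1 if score >= 0 else -1

    return predict


def toy_data(n, seed):
    """Synthetic data: attribute 0 decides the class, the others are noise."""
    rng = random.Random(seed)
    rows = [[rng.random() for _ in range(3)] for _ in range(n)]
    labels = [1 if r[0] > 0.5 else -1 for r in rows]
    return rows, labels
```

Calling `fit_ensemble(*toy_data(200, seed=1))` returns a `predict` function for single instances. In this sketch the cost of each base learner depends on `sample_size` rather than on the full dataset size, which is one plausible reading of the scalability property mentioned above.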

Author information

Authors and Affiliations

Institute of Informatics, Faculty of Mathematics, Physics and Informatics, University of Gdańsk, 80-308, Gdańsk, Poland
Joanna Jȩdrzejowicz
Department of Information Systems, Gdynia Maritime University, Morska 83, 81-225, Gdynia, Poland
Piotr Jȩdrzejowicz

Corresponding author

Correspondence to Joanna Jȩdrzejowicz.

Editor information

Editors and Affiliations

Department of Information Systems, Faculty of Computer Science and Management, Wrocław University of Science and Technology, Wrocław, Poland
Ngoc Thanh Nguyen
Department of Computer Science, University of Cyprus, Nicosia, Cyprus
George A. Papadopoulos
Department of Information Systems, Gdynia Maritime University, Gdynia, Poland
Piotr Jędrzejowicz
Department of Information Systems, Faculty of Computer Science and Management, Wrocław University of Science and Technology, Wrocław, Poland
Bogdan Trawiński
Department of Information Systems, University of Münster, Münster, Germany
Gottfried Vossen

About this paper

Cite this paper

Jȩdrzejowicz, J., Jȩdrzejowicz, P. (2017). Gene Expression Programming Ensemble for Classifying Big Datasets. In: Nguyen, N., Papadopoulos, G., Jędrzejowicz, P., Trawiński, B., Vossen, G. (eds) Computational Collective Intelligence. ICCCI 2017. Lecture Notes in Computer Science, vol 10449. Springer, Cham. https://doi.org/10.1007/978-3-319-67077-5_1

DOI: https://doi.org/10.1007/978-3-319-67077-5_1
Published: 07 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67076-8
Online ISBN: 978-3-319-67077-5
eBook Packages: Computer Science, Computer Science (R0)