Transfer learning in constructive induction with Genetic Programming

Abstract

Transfer learning (TL) is the process by which some aspects of a machine learning model generated on a source task are transferred to a target task, to simplify the learning required to solve the target. TL in Genetic Programming (GP) has not received much attention, since it is normally assumed that an evolved symbolic expression is specifically tailored to a problem’s data and thus cannot be used in other problems. The goal of this work is to present a broad and diverse study of TL in GP, considering a varied set of source and target tasks, and dealing with questions that have received little or no attention in previous GP literature. In particular, this work studies the performance of transferred solutions when the source and target tasks are from different domains, and when they do not share a similar input feature space. Additionally, the relationship between the success and failure of transferred solutions is studied, considering different source and target tasks. Finally, the predictability of TL performance is analyzed for the first time in GP literature. The study is carried out using GP-based constructive induction of features, a wrapper-based approach in which GP constructs feature transformations and an additional learning algorithm fits the final model. The experimental work presents several notable results and contributions. First, TL is capable of generating solutions that outperform, in many cases, baseline methods in classification and regression tasks. Second, it is shown that some problems are good source problems while others are good targets in a TL system. Third, the transferability of solutions is not necessarily symmetric between two problems. Finally, results show that it is possible to predict the success of TL in some cases, particularly in classification tasks.
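The wrapper idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two transformations below are hand-written, hypothetical stand-ins for expressions that GP would evolve on a source task, the nearest-centroid classifier is an assumed placeholder for the "additional learning algorithm", and the toy target data are invented for illustration. The sketch also assumes that the target task has at least as many input features as the transferred expressions reference.

```python
from typing import Callable, Dict, List, Sequence

Row = Sequence[float]
Transform = Callable[[Row], float]

# Hypothetical stand-ins for feature transformations evolved by GP on a
# source task; in a real system these would come from a GP run.
source_transforms: List[Transform] = [
    lambda x: x[0] * x[1],
    lambda x: x[0] * x[0] + x[1] * x[1],
]


def construct_features(rows: Sequence[Row], transforms: Sequence[Transform]) -> List[List[float]]:
    """Map raw features into the constructed feature space."""
    return [[t(r) for t in transforms] for r in rows]


def fit_centroids(features: Sequence[Sequence[float]], labels: Sequence[int]) -> Dict[int, List[float]]:
    """Fit the final model: one mean vector per class (assumed learner)."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, v in enumerate(f):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids: Dict[int, List[float]], features: Sequence[Sequence[float]]) -> List[int]:
    """Assign each row to the class with the closest centroid."""
    def dist2(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda y: dist2(centroids[y], f)) for f in features]


# Invented target task: the label depends on the sign of x0 * x1.
target_X = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0), (2.0, 0.5), (-0.5, 2.0)]
target_y = [1, 1, 0, 0, 1, 0]

# Transfer step: reuse the source transformations unchanged on the target
# data, then fit only the final model on the constructed features.
Z = construct_features(target_X, source_transforms)
model = fit_centroids(Z, target_y)
predictions = predict(model, Z)
```

In this sketch only the final model is fitted on the target task; whether transferred transformations are reused as they are or refined further on the target is a design choice that the abstract does not specify.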


Notes

  1. The terms original features, raw features and problem features are used to denote the input features (independent variables) that are contained in a machine learning dataset. These terms are used interchangeably in the rest of this paper.

  2. We do not provide a complete description of all the works in this area; such a survey is beyond the scope of this paper. Instead, we chose representative papers of the groups and subgroups in our taxonomy of research in this area. We focused on unique papers that illustrate the main features of each group, and included works that have also achieved good performance on real-world tasks.

  3. http://gplab.sourceforge.net.

  4. These results do not imply that source task features are not important in predicting TL performance, only that target task features are more important for a Random Forest regression model.

  5. All of the predictors are plotted on a [0, 1] scale. For predictors with unbounded domains, min-max normalization was used.
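The min-max normalization mentioned in the last note maps each value v to (v - min) / (max - min). A minimal sketch follows; the function name and the handling of constant inputs (mapped to 0.0) are assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed convention: a constant predictor carries no spread,
        # so every value is mapped to 0.0 to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


scaled = min_max_normalize([3.0, 7.0, 11.0])
```

Here `scaled` is `[0.0, 0.5, 1.0]`, since 7.0 lies halfway between the minimum 3.0 and the maximum 11.0.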

Acknowledgements

This research was funded by CONACYT (Mexico) Fronteras de la Ciencia 2015-2 Project No. FC-2015-2/944, and the first author was supported by CONACYT graduate scholarship No. 302526. This work was also partially supported by FCT through funding of LASIGE Research Unit (UID/CEC/00408/2019), and projects PERSEIDS (PTDC/EMS-SIS/0642/2014), INTERPHENO (PTDC/ASP-PLA/28726/2017), OPTOX (PTDC/CTA-AMB/30056/2017), BINDER (PTDC/CCI-INF/29168/2017), PREDICT (PTDC/CCI-CIF/29877/2017) and GADgET (DSAIPA/DS/0022/2018). The authors also thank Mauro Castelli from NOVA IMS for suggesting important references on transfer learning with GP.

Author information

Corresponding author

Correspondence to Leonardo Trujillo.

Cite this article

Muñoz, L., Trujillo, L. & Silva, S. Transfer learning in constructive induction with Genetic Programming. Genet Program Evolvable Mach 21, 529–569 (2020). https://doi.org/10.1007/s10710-019-09368-y

Keywords

  • Transfer learning
  • Constructive induction of features
  • Genetic Programming