Generalisation and Model Selection in Supervised Learning with Evolutionary Computation

  • Jem J. Rowland
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2611)


Supervised learning based on evolutionary computation (EC) has been demonstrated to be an effective approach to forming predictive models in genomics, spectral interpretation, and other problems in modern biology. Longer-established methods such as partial least squares (PLS) and artificial neural networks (ANN) are also often successful. In supervised learning, overtraining is always a potential problem, and the literature reports numerous methods of validating predictive models in order to avoid it. Some of these approaches can be applied to EC-based supervised learning, but the characteristics of EC learning differ from those of PLS and ANN, and selecting a suitably general model can be more difficult. This paper reviews the issues and the various approaches, illustrating salient points with examples taken from applications in bioinformatics.
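As a rough illustration of the overtraining issue raised above, the following sketch evolves polynomial coefficients with a simple (1+1) evolutionary strategy and keeps the model that scored best on a held-out validation set, rather than the final one. It is not taken from the paper: the synthetic data, the function names (`make_data`, `evolve_with_validation`), the polynomial degree, and the mutation step size are all assumptions chosen for brevity.

```python
import random

def predict(coeffs, x):
    # Evaluate a polynomial with Horner's rule; coeffs[0] is the constant term.
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def mse(coeffs, data):
    # Mean squared error of the polynomial over (x, y) pairs.
    return sum((predict(coeffs, x) - y) ** 2 for x, y in data) / len(data)

def make_data(n, rng):
    # Hypothetical noisy linear target, used only for this illustration.
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 2.0 * x + rng.gauss(0.0, 0.3)))
    return data

def evolve_with_validation(train, valid, generations=300, degree=6, seed=0):
    # (1+1) strategy: mutate the current model and accept the child if its
    # training error is no worse. Validation error never drives the search;
    # it is only used to choose which accepted model to report.
    rng = random.Random(seed)
    current = [0.0] * (degree + 1)
    current_train = mse(current, train)
    selected = list(current)
    selected_valid = mse(current, valid)
    history = [(current_train, selected_valid)]
    for _ in range(generations):
        child = [c + rng.gauss(0.0, 0.1) for c in current]
        child_train = mse(child, train)
        if child_train <= current_train:
            current, current_train = child, child_train
            current_valid = mse(current, valid)
            if current_valid < selected_valid:
                selected, selected_valid = list(current), current_valid
        # Record (training error, validation error) of the current model.
        history.append((current_train, mse(current, valid)))
    return current, selected, history

if __name__ == "__main__":
    rng = random.Random(1)
    train, valid = make_data(10, rng), make_data(10, rng)
    final, selected, history = evolve_with_validation(train, valid)
    print("final model validation MSE:   ", mse(final, valid))
    print("selected model validation MSE:", mse(selected, valid))
```

In this sketch the training error can only fall from one generation to the next, while the validation error is free to rise again, which is why the reported model is chosen by validation error. Because the validation set is used for selection, an unbiased estimate of generalisation would still need a third, untouched test set.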


Keywords: Evolutionary Computation; Supervised Learning; Model Selection Criterion; Sporadic Breast Cancer; Linear Genetic Programming





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Jem J. Rowland (1)
  1. Dept. of Computer Science, University of Wales Aberystwyth, Wales, UK
