Better Solutions Faster: Soft Evolution of Robust Regression Models in Pareto Genetic Programming

Genetic Programming Theory and Practice V

“Better solutions faster” is the reality of the industrial modeling world, now more than ever. Efficiency requirements, market pressures, and ever-changing data force us to use symbolic regression via genetic programming (GP) in a highly automated fashion. We therefore want our GP system to produce simple solutions of the highest possible quality with the lowest computational effort and with high consistency across independent GP runs.

In this chapter, we show that genetic programming with a focus on ranking in combination with goal softening is a very powerful way to improve the efficiency and effectiveness of the evolutionary search. Our strategy consists of partial fitness evaluations of individuals on random subsets of the original data set, with a gradual increase in the subset size in consecutive generations. From a series of experiments performed on three test problems, we observed that those evolutions that started from the smallest subset sizes (10%) consistently led to results that are superior in terms of the goodness of fit, consistency between independent runs, and computational effort. Our experience indicates that solutions obtained using this approach are also less complex and more robust against over-fitting.
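
The partial-evaluation strategy described above can be illustrated with a minimal sketch. The abstract does not specify how the subset size grows, so the linear schedule here is an assumption, as are the mean-squared-error measure and the names `subset_fraction` and `partial_fitness`; none of these are taken from the chapter itself.

```python
import random


def subset_fraction(generation, n_generations, start=0.10, end=1.0):
    """Fraction of the data used at a given generation.

    Assumption: the fraction grows linearly from `start` to `end`.
    """
    if n_generations <= 1:
        return end
    t = generation / (n_generations - 1)
    return start + t * (end - start)


def partial_fitness(model, data, fraction, rng):
    """Mean squared error of `model` on a random subset of `data`.

    `data` is a list of (x, y) pairs; a fresh subset is drawn on every call.
    """
    size = max(1, round(fraction * len(data)))
    subset = rng.sample(data, size)
    return sum((model(x) - y) ** 2 for x, y in subset) / size


# Toy data from y = 2x + 1, and the fractions used over a 10-generation run.
rng = random.Random(0)
data = [(x, 2.0 * x + 1.0) for x in range(100)]
fractions = [subset_fraction(g, 10) for g in range(10)]

# A candidate model is scored on 10% of the data in the first generation.
error = partial_fitness(lambda x: 2.0 * x + 1.0, data, fractions[0], rng)
```

Because each call draws a new random subset, the fitness of an imperfect model is a noisy estimate in early generations. This is consistent with the emphasis on ranking and goal softening rather than on exact fitness values.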

We find that a near-optimal strategy for allocating the computational budget over a GP run is to distribute it evenly over all generations. This implies that early in the run more individuals can be evaluated on small subsets, which promotes exploration. Exploitation becomes more important towards the end of the run, when all individuals are evaluated on the full data set with correspondingly smaller population sizes.
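
The arithmetic behind an even budget split can be sketched as follows. This assumes the budget is counted in single-record evaluations (one individual evaluated on one data record), which the abstract does not state; the function `population_sizes` and the numbers used are hypothetical.

```python
def population_sizes(total_budget, subset_sizes):
    """Population size per generation under an even budget split.

    Assumption: evaluating one individual on a subset of s records costs s
    units, so a per-generation budget b allows b // s individuals.
    """
    per_generation = total_budget // len(subset_sizes)
    return [max(1, per_generation // s) for s in subset_sizes]


# Hypothetical run: 4 generations, subset sizes growing to the full 1000 records.
subset_sizes = [100, 250, 500, 1000]
sizes = population_sizes(total_budget=400_000, subset_sizes=subset_sizes)
print(sizes)  # [1000, 400, 200, 100]
```

In this toy setting the first generation evaluates ten times as many individuals as the last one at the same cost, which is the exploration-then-exploitation shift the paragraph describes.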

Copyright information

© 2008 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Vladislavleva, E., Smits, G., Kotanchek, M. (2008). Better Solutions Faster: Soft Evolution of Robust Regression Models in Pareto Genetic Programming. In: Riolo, R., Soule, T., Worzel, B. (eds) Genetic Programming Theory and Practice V. Genetic and Evolutionary Computation Series. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-76308-8_2

  • DOI: https://doi.org/10.1007/978-0-387-76308-8_2

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-76307-1

  • Online ISBN: 978-0-387-76308-8

  • eBook Packages: Computer Science, Computer Science (R0)
