Model extraction from data has at least three objectives: the resulting models should be accurate, comprehensible and interesting. Multi-objective evolutionary algorithms are therefore a natural choice for this problem. They can optimize several incommensurable objectives in a single run without making any assumptions about the importance of each objective. This chapter proposes several multi-objective evolutionary algorithms to tackle three different model extraction tasks. The first approach performs supervised classification whilst overcoming some of the shortcomings of existing approaches. The second and third approaches tackle two survival analysis problems. All approaches are evaluated on artificial, benchmark and real-world medical data sets.
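The idea of optimizing incommensurable objectives without weighting them is usually expressed through Pareto dominance. The following is a minimal sketch of that concept, not the chapter's algorithm: the function names, the restriction to two objectives (error rate and rule count, both minimised) and the example values are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (all objectives minimised).

    a dominates b when it is no worse on every objective and
    strictly better on at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidate models: (error rate, number of rules).
# The values are invented for illustration only.
models = [(0.10, 12), (0.15, 5), (0.12, 12), (0.30, 2), (0.15, 7)]
front = pareto_front(models)
print(front)
```

No objective is given a weight here. The filter returns a set of trade-off models, and the choice between a more accurate and a more compact model is left to the user.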
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Setzkorn, C. (2008). Classification and Survival Analysis Using Multi-objective Evolutionary Algorithms. In: Ghosh, A., Dehuri, S., Ghosh, S. (eds) Multi-Objective Evolutionary Algorithms for Knowledge Discovery from Databases. Studies in Computational Intelligence, vol 98. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77467-9_6
DOI: https://doi.org/10.1007/978-3-540-77467-9_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77466-2
Online ISBN: 978-3-540-77467-9
eBook Packages: Engineering, Engineering (R0)