Abstract
Since the late 1990s, Receiver Operating Characteristic (ROC) curves have offered a novel way of comparing supervised learning algorithms.
This approach yields an NP-complete optimization criterion for supervised learning: the area under the ROC curve.
This optimization criterion, tackled with evolution strategies, is experimentally compared to the structural risk criterion tackled by quadratic optimization in Support Vector Machines. Comparable results are obtained on a set of benchmark problems from the Irvine repository, at a fraction of the SVM computational cost.
Additionally, the variety of solutions provided by evolutionary computation can be exploited for visually inspecting the contributing factors of the phenomenon under study. The impact study and sensitivity analysis facilities offered by ROGER (ROC-based Genetic LearneR) are demonstrated on a medical application, the identification of Atherosclerosis Risk Factors.
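As an illustration of the criterion described above, the following is a minimal sketch, not the ROGER implementation. It assumes a linear scoring function and a plain (mu + lambda) evolution strategy with a fixed mutation step; neither choice is stated in the abstract, and the names `auc`, `linear_scores`, and `evolve_auc` are hypothetical. The area under the ROC curve is computed as the Wilcoxon-Mann-Whitney statistic, that is, the fraction of (positive, negative) example pairs that the scores rank correctly.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y != 1]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def linear_scores(weights, X):
    """Score each example by a weighted sum of its features
    (assumption: a linear hypothesis space, chosen for illustration)."""
    return [sum(w * x for w, x in zip(weights, row)) for row in X]


def evolve_auc(X, labels, mu=5, lam=20, generations=30, sigma=0.3, seed=0):
    """(mu + lambda) evolution strategy maximizing training AUC.
    Returns the final mu weight vectors, best first."""
    rng = random.Random(seed)
    dim = len(X[0])

    def fitness(w):
        return auc(linear_scores(w, X), labels)

    # Random initial parent population.
    parents = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(mu)]
    for _ in range(generations):
        # Each offspring is a Gaussian perturbation of a random parent.
        offspring = []
        for _ in range(lam):
            parent = rng.choice(parents)
            offspring.append([w + rng.gauss(0, sigma) for w in parent])
        # Plus-selection: keep the best mu among parents and offspring.
        pool = parents + offspring
        pool.sort(key=fitness, reverse=True)
        parents = pool[:mu]
    return parents


if __name__ == "__main__":
    # Toy data, invented for this sketch only.
    X = [[2.0, 0.0], [1.5, 1.0], [0.0, 1.0], [-1.0, 0.5]]
    y = [1, 1, 0, 0]
    population = evolve_auc(X, y)
    print(auc(linear_scores(population[0], X), y))
```

Because the strategy returns a population rather than a single hypothesis, the retained weight vectors can be compared with one another, which is one plausible reading of the "variety of solutions" mentioned in the abstract.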
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sebag, M., Azé, J., Lucas, N. (2004). ROC-Based Evolutionary Learning: Application to Medical Data Mining. In: Liardet, P., Collet, P., Fonlupt, C., Lutton, E., Schoenauer, M. (eds) Artificial Evolution. EA 2003. Lecture Notes in Computer Science, vol 2936. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24621-3_31
DOI: https://doi.org/10.1007/978-3-540-24621-3_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21523-3
Online ISBN: 978-3-540-24621-3
eBook Packages: Springer Book Archive