Abstract
Supervised classification methods have been the focus of a vast amount of research in recent decades, within a variety of intellectual disciplines, including statistics, machine learning, pattern recognition, and data mining. Highly sophisticated methods have been developed, using the full power of recent advances in computation. Many of these methods would have been simply inconceivable to earlier generations. However, these advances have largely taken place within the context of the classical supervised classification paradigm of data analysis. That is, a classification rule is constructed based on a given ‘design sample’ of data, with known and well-defined classes, and this rule is then used to classify future objects. This paper argues that this paradigm is often, perhaps typically, an over-idealisation of the practical realities of supervised classification problems. It also argues that the sequential nature of the statistical modelling process means that the large gains in predictive accuracy are achieved early in that process. Putting these two arguments together leads to the suspicion that the apparent superiority of the highly sophisticated methods is often illusory: simple methods are often equally effective or even superior in classifying new data points.
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hand, D.J. (2004). Academic Obsessions and Classification Realities: Ignoring Practicalities in Supervised Classification. In: Banks, D., McMorris, F.R., Arabie, P., Gaul, W. (eds) Classification, Clustering, and Data Mining Applications. Studies in Classification, Data Analysis, and Knowledge Organisation. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17103-1_21
DOI: https://doi.org/10.1007/978-3-642-17103-1_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22014-5
Online ISBN: 978-3-642-17103-1
eBook Packages: Springer Book Archive