Models for Understanding Versus Models for Prediction

Saporta, Gilbert

doi:10.1007/978-3-7908-2084-3_26

Abstract

According to a standard point of view, statistical modelling consists of establishing a parsimonious representation of a random phenomenon, generally based upon the knowledge of an expert in the application field: the aim of a model is to provide a better understanding of the data and of the underlying mechanism that produced them. On the other hand, Data Mining and KDD deal with predictive modelling: models are merely algorithms, and the quality of a model is assessed by its performance in predicting new observations. In this communication, we develop some general considerations about both aspects of modelling.
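The predictive criterion mentioned in the abstract, judging a model by how well it predicts new observations, can be illustrated with a simple hold-out split. The sketch below is not taken from the paper: the simulated data, the straight-line model, the 30% hold-out fraction, and the function names `fit_line`, `mse`, and `holdout_evaluation` are all assumptions chosen for illustration.

```python
import random


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def mse(a, b, xs, ys):
    """Mean squared prediction error of the fitted line on (xs, ys)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def holdout_evaluation(n=200, test_fraction=0.3, seed=0):
    """Fit on a training subset, then score on observations not used for fitting."""
    rng = random.Random(seed)
    # Simulated data (an assumption): true intercept 1, true slope 2, unit Gaussian noise.
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]

    # The points are generated independently, so taking the first block as the
    # hold-out set is equivalent to a random split.
    n_test = int(n * test_fraction)
    x_test, y_test = xs[:n_test], ys[:n_test]
    x_train, y_train = xs[n_test:], ys[n_test:]

    a, b = fit_line(x_train, y_train)
    return {
        "intercept": a,
        "slope": b,
        "train_mse": mse(a, b, x_train, y_train),  # in-sample fit
        "test_mse": mse(a, b, x_test, y_test),     # error on new observations
    }


result = holdout_evaluation()
print(result)
```

Here the fitted coefficients speak to the "understanding" side of modelling, while `test_mse` is the kind of out-of-sample performance measure that the predictive point of view relies on.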

Author information

Authors and Affiliations

Chaire de statistique appliquée & CEDRIC, CNAM, 292 rue Saint Martin, Paris, France
Gilbert Saporta

Authors

Gilbert Saporta

Corresponding author

Correspondence to Gilbert Saporta.

Editor information

Editors and Affiliations

Faculdade de Economia, Rua Dr. Roberto Frias, 4200-464, Porto, Portugal
Paula Brito

About this paper

Cite this paper

Saporta, G. (2008). Models for Understanding Versus Models for Prediction. In: Brito, P. (ed.) COMPSTAT 2008. Physica-Verlag HD. https://doi.org/10.1007/978-3-7908-2084-3_26

DOI: https://doi.org/10.1007/978-3-7908-2084-3_26
Publisher Name: Physica-Verlag HD
Print ISBN: 978-3-7908-2083-6
Online ISBN: 978-3-7908-2084-3
eBook Packages: Mathematics and Statistics (R0)