Abstract
Model building and scoring have been established statistical methods for decades, and a wide variety of literature is available for study. Rather than giving a complete introduction to model building and scoring techniques, this chapter explains the main predictive modeling techniques from an angle that lets the reader understand the change in paradigm that comes with the transition from classical scores to net scores. First, the problem to be solved is explained and formalized. The second section introduces common scoring methods, such as decision trees and (logistic) regression, always with the generalization to net scoring in mind. The third section introduces well-known quality measures for scoring models. Although the facts presented here may be familiar to many readers, studying this chapter is nevertheless recommended in order to become familiar with the way scoring methods are presented and described in this book.
Notes
1. In direct marketing, tracking behavioral data is considered increasingly important, and particular effort is dedicated to collecting and transmitting as much of this data as possible. Many mobile devices allow the transmission of positioning data (Is the customer next to a store?), video or acoustic data, or information on websites visited. Loyalty cards enable the provider to assign IDs to customers in order to track their purchase behavior over an extended period across different channels, stores, or companies, even if the customer pays cash.
2. It is important to emphasize that only observations where the event could have occurred are relevant. Customers holding a certain product may be able to buy it again, but bank customers without credit will not default, and males will not get pregnant.
3. A model may be trained to predict responses in May from March data. This data could be split into training and validation datasets. Performance indicators could then be taken from deploying the model on April data, where it would predict responses for June. Applying the model to data from a different time slice ensures a very honest evaluation of model quality, but the result may also be subject to seasonal effects.
4. For example, data from external providers about creditworthiness, social atlases, etc. may result in better models without recouping their cost.
5. The subscript on \(\chi_1^2\) indicates a \(\chi^2\) distribution with one degree of freedom.
6. Each point “pulls” the line a little bit towards itself.
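The time-slice evaluation described in note 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names `slice` and `responded`, the function name `split_by_time_slice`, the 30% validation share, and the toy data are all hypothetical and do not come from the chapter.

```python
import random


def split_by_time_slice(records, train_slice, test_slice,
                        validation_share=0.3, seed=42):
    """Split records into training, validation, and out-of-time test sets.

    Each record is a dict with a hypothetical "slice" key naming the month
    in which its features were observed (e.g. "March").
    """
    rng = random.Random(seed)

    # Records from the training time slice are shuffled and divided into
    # training and validation parts.
    in_time = [r for r in records if r["slice"] == train_slice]
    rng.shuffle(in_time)
    n_validation = round(len(in_time) * validation_share)
    validation = in_time[:n_validation]
    training = in_time[n_validation:]

    # Records from a later time slice are held out entirely; performance
    # indicators would be computed on these.
    out_of_time = [r for r in records if r["slice"] == test_slice]
    return training, validation, out_of_time


# Toy data (assumed): March features with May responses,
# April features with June responses.
records = (
    [{"id": i, "slice": "March", "responded": i % 5 == 0} for i in range(100)]
    + [{"id": i, "slice": "April", "responded": i % 7 == 0} for i in range(100, 160)]
)

training, validation, out_of_time = split_by_time_slice(records, "March", "April")
print(len(training), len(validation), len(out_of_time))
```

Keeping the April records out of both the training and validation sets is what makes the evaluation independent of the time slice the model was fitted on; as the note points out, seasonal differences between the slices can still affect the measured quality.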
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Michel, R., Schnakenburg, I., von Martens, T. (2019). The Traditional Approach: Gross Scoring. In: Targeting Uplift. Springer, Cham. https://doi.org/10.1007/978-3-030-22625-1_2
DOI: https://doi.org/10.1007/978-3-030-22625-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22624-4
Online ISBN: 978-3-030-22625-1
eBook Packages: Mathematics and Statistics (R0)