Székely Regularization for Uplift Modeling

Jaroszewicz, Szymon; Zaniewicz, Łukasz

doi:10.1007/978-3-319-18781-5_8

Szymon Jaroszewicz & Łukasz Zaniewicz

Chapter
First Online: 01 January 2015

Part of the book series: Studies in Computational Intelligence (SCI, volume 605)

Abstract

Uplift modeling is a subfield of machine learning concerned with predicting the causal effect of an action at the level of individuals. This is achieved by using two training sets: treatment, containing objects that have been subjected to an action, and control, containing objects on which the action has not been performed. An uplift model then predicts the difference between the conditional success probabilities in the two groups. Uplift modeling is best applied to training sets obtained from randomized controlled trials, but such experiments are not always possible, in which case treatment assignment is often biased. In this paper we present a modification of Uplift Support Vector Machines that makes them less sensitive to such bias. This is achieved by including in the model formulation an additional term that penalizes models which score the treatment and control groups differently. We call the technique Székely regularization since it is based on the energy distance proposed by Székely and Rizzo. An optimization algorithm based on stochastic gradient descent techniques has also been developed. We demonstrate experimentally that the proposed regularization term does indeed produce uplift models that are less sensitive to biased treatment assignment.
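
As a rough illustration of the quantity the regularization term is built on, the sketch below computes the sample energy distance between the scores that a linear model assigns to treatment and control objects. It is not the authors' formulation or their stochastic gradient descent algorithm. The function names `energy_distance` and `score_gap`, the weight vectors, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def energy_distance(a, b):
    """Sample (V-statistic) energy distance between two samples.

    Computes 2*E|A-B| - E|A-A'| - E|B-B'| with Euclidean norms,
    where each row of `a` and `b` is one observation.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a.reshape(a.shape[0], -1)  # treat 1-D input as a single column
    b = b.reshape(b.shape[0], -1)

    def mean_pairwise(u, v):
        # Mean Euclidean distance over all pairs (row of u, row of v).
        diff = u[:, None, :] - v[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=-1)).mean()

    return 2.0 * mean_pairwise(a, b) - mean_pairwise(a, a) - mean_pairwise(b, b)


def score_gap(w, x_treatment, x_control):
    """Hypothetical penalty: energy distance between the linear scores
    x @ w of the treatment group and of the control group."""
    return energy_distance(x_treatment @ w, x_control @ w)


# Synthetic data (assumption): feature 0 is shifted in the treatment group,
# which mimics biased treatment assignment.
rng = np.random.default_rng(0)
x_control = rng.normal(size=(200, 3))
x_treatment = rng.normal(size=(200, 3))
x_treatment[:, 0] += 1.0

w_uses_biased = np.array([1.0, 0.0, 0.0])     # relies on the shifted feature
w_ignores_biased = np.array([0.0, 1.0, 0.0])  # relies on an unshifted feature

gap_biased = score_gap(w_uses_biased, x_treatment, x_control)
gap_unbiased = score_gap(w_ignores_biased, x_treatment, x_control)
print(gap_biased, gap_unbiased)
```

A model that leans on the feature whose distribution differs between the groups scores the two groups differently and so gets the larger gap. In a regularized objective, a term of this kind would presumably be added to the loss with a weight, which discourages such models.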

Notes

1. The values of the class variable should not be confused with model predictions defined in (1). For example, a model prediction of \(+1\) means that we expect the class variable to take the value of \(+1\) if the action is performed (\(y^T=+1\)) and to take the value of \(-1\) if the action is not performed (\(y^C=-1\)).

References

Bach F, Moulines E (2011) Non-asymptotic analysis of stochastic approximation algorithms for machine learning. In: Proceedings of advances in neural information processing systems 24 (NIPS 2011)
Guelman L, Guillén M, Pérez-Marín AM (2012) Random forests for uplift modeling: an insurance customer retention case. In: Modeling and simulation in engineering, economics and management. Lecture notes in business information processing (LNBIP), vol 115. Springer, pp. 123–133
Holland PW (1986) Statistics and causal inference. J Am Stat Assoc 81(396):945–960
Jaśkowski M, Jaroszewicz S (2012) Uplift modeling for clinical trial data. In: ICML 2012 workshop on machine learning for clinical data analysis, Edinburgh, June 2012
Connors AF Jr, Speroff T, Dawson NV et al (1996) The effectiveness of right heart catheterization in the initial care of critically ill patients. JAMA 276(11):889–897
Koronacki J, Ćwik J (2008) Statystyczne systemy uczące się. Exit, Warsaw (in Polish)
Kushner HJ, Yin GG (2003) Stochastic approximation and recursive algorithms and applications. Springer
Kuusisto F, Costa VS, Nassif H, Burnside E, Page D, Shavlik J (2014) Support vector machines for differential prediction. In: ECML-PKDD
Polyak BT, Juditsky AB (1992) Acceleration of stochastic approximation by averaging. SIAM J Control Optim 30(4):838–855
Radcliffe NJ, Surry PD (1999) Differential response analysis: modeling true response by isolating the effect of a single action. In: Proceedings of credit scoring and credit control VI. Credit Research Centre, University of Edinburgh Management School
Radcliffe NJ, Surry PD (2011) Real-world uplift modelling with significance-based uplift trees. Portrait Technical Report TR-2011-1, Stochastic Solutions
Robins J, Rotnitzky A (2004) Estimation of treatment effects in randomised trials with non-compliance and a dichotomous outcome using structural mean models. Biometrika 91(4):763–783
Rosenbaum PR (1987) Model-based direct adjustment. J Am Stat Assoc 82(398):387–394
Rosenbaum PR, Rubin DB (1983) The central role of the propensity score in observational studies for causal effects. Biometrika 70(1):41–55
Rzepakowski P, Jaroszewicz S (2010) Decision trees for uplift modeling. In: Proceedings of the 10th IEEE international conference on data mining (ICDM), Sydney, Australia, pp. 441–450, Dec 2010
Rzepakowski P, Jaroszewicz S (2012) Decision trees for uplift modeling with single and multiple treatments. Knowl Inf Syst 32:303–327, Aug 2012
Sołtys M, Jaroszewicz S, Rzepakowski P (2014) Ensemble methods for uplift modeling. Data mining and knowledge discovery, pp. 1–29 (online first)
Székely GJ, Rizzo ML (2004) Testing for equal distributions in high dimension. Interstat, Nov 2004
Székely GJ, Rizzo ML (2005) Hierarchical clustering via joint between-within distances: extending Ward’s minimum variance method. J Classif 22(2):151–183
Székely GJ, Rizzo ML, Bakirov NK (2007) Measuring and testing dependence by correlation of distances. Ann Stat 35(6):2769–2794
Vansteelandt S, Goetghebeur E (2003) Causal inference with generalized structural mean models. J R Stat Soc B 65(4):817–835
Zaniewicz Ł, Jaroszewicz S (2013) Support vector machines for uplift modeling. In: The first IEEE ICDM workshop on causal discovery (CD 2013), Dallas, Dec 2013

Acknowledgments

This work was supported by Research Grant no. N N516 414938 of the Polish Ministry of Science and Higher Education (Ministerstwo Nauki i Szkolnictwa Wyższego) from research funds for the period 2010–2014. Ł.Z. was co-funded by the European Union from resources of the European Social Fund. Project POKL ‘Information technologies: Research and their interdisciplinary applications’, Agreement UDA-POKL.04.01.01-00-051/10-00.

Author information

Authors and Affiliations

Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
Szymon Jaroszewicz & Łukasz Zaniewicz
National Institute of Telecommunications, Warsaw, Poland
Szymon Jaroszewicz

Corresponding author

Correspondence to Szymon Jaroszewicz.

Editor information

Editors and Affiliations

Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada
Stan Matwin
Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland, and Warsaw University of Technology, Warsaw, Poland
Jan Mielniczuk

About this chapter

Cite this chapter

Jaroszewicz, S., Zaniewicz, Ł. (2016). Székely Regularization for Uplift Modeling. In: Matwin, S., Mielniczuk, J. (eds) Challenges in Computational Statistics and Data Mining. Studies in Computational Intelligence, vol 605. Springer, Cham. https://doi.org/10.1007/978-3-319-18781-5_8

DOI: https://doi.org/10.1007/978-3-319-18781-5_8
Published: 28 June 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18780-8
Online ISBN: 978-3-319-18781-5
eBook Packages: Engineering, Engineering (R0)
