Abstract
Binary logistic regression is a machine learning tool for classification and discrimination that is widely used in business analytics and medical research. Transforming continuous predictors to improve the performance of a logistic regression model is common practice, but no systematic method for finding optimal transformations exists in the statistical or data mining literature. This paper considers the problem of selecting transformations of continuous predictors to improve the performance of logistic regression models. The proposed method is based on the point-biserial correlation coefficient between the binary response and a continuous predictor. Several examples are presented to illustrate the proposed method.
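The abstract does not spell out the selection rule, so the following is only a minimal sketch under a stated assumption: candidate transformations of a predictor are ranked by the absolute point-biserial correlation between the transformed predictor and the binary response. The candidate set, the function names `point_biserial` and `rank_transformations`, and the toy data are hypothetical illustrations, not the procedure or data used in the paper.

```python
import math


def point_biserial(y, x):
    """Point-biserial correlation between a 0/1 response y and a continuous x.

    Uses r_pb = (M1 - M0) / s_n * sqrt(n1 * n0) / n, where s_n is the
    population standard deviation of x over all n observations.
    """
    n = len(x)
    group1 = [xi for xi, yi in zip(x, y) if yi == 1]
    group0 = [xi for xi, yi in zip(x, y) if yi == 0]
    n1, n0 = len(group1), len(group0)
    if n1 == 0 or n0 == 0:
        raise ValueError("both response classes must be present")
    mean_all = sum(x) / n
    sd = math.sqrt(sum((xi - mean_all) ** 2 for xi in x) / n)
    if sd == 0:
        raise ValueError("predictor is constant")
    m1 = sum(group1) / n1
    m0 = sum(group0) / n0
    return (m1 - m0) / sd * math.sqrt(n1 * n0) / n


# Hypothetical candidate set; log and sqrt assume strictly positive values.
CANDIDATES = {
    "identity": lambda v: v,
    "log": math.log,
    "sqrt": math.sqrt,
    "square": lambda v: v * v,
}


def rank_transformations(y, x, candidates=CANDIDATES):
    """Return (name, |r_pb|) pairs sorted from strongest to weakest association.

    Assumption: a larger |r_pb| marks a more promising transformation.
    """
    scores = {
        name: abs(point_biserial(y, [f(v) for v in x]))
        for name, f in candidates.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy, made-up data: a right-skewed predictor and a binary response.
    x = [1, 2, 3, 4, 100, 200, 400, 1000]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    for name, score in rank_transformations(y, x):
        print(f"{name:>8}: {score:.3f}")
```

A ranking of this kind is only a screening step; any transformation it suggests would still need to be checked by fitting the logistic regression model and comparing its performance.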
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Chang, M., Dalpatadu, R.J., Singh, A.K. (2018). Selection of Transformations of Continuous Predictors in Logistic Regression. In: Latifi, S. (eds) Information Technology - New Generations. Advances in Intelligent Systems and Computing, vol 738. Springer, Cham. https://doi.org/10.1007/978-3-319-77028-4_58
DOI: https://doi.org/10.1007/978-3-319-77028-4_58
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77027-7
Online ISBN: 978-3-319-77028-4
eBook Packages: Engineering, Engineering (R0)