Abstract
Missing data is a common challenge in the statistical analysis of clinical data. Incomplete datasets can arise in several ways, such as mishandling of samples, low signal-to-noise ratio, measurement error, non-response to questions, or deletion of aberrant values. Missing data can cause severe problems in statistical analysis and may lead to invalid conclusions. Multiple imputation is a useful strategy for handling missing data, and inference based on it is widely accepted as yielding less biased and more valid results. In this chapter, we apply multiple imputation to a publicly accessible heart disease dataset with a high missing rate and build a prediction model for heart disease diagnosis.
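As an illustration of how inference from multiple imputation is typically pooled, the sketch below applies Rubin's rules to a set of per-imputation estimates. It is a minimal example under stated assumptions, not the chapter's own code: the function name `pool_estimates` and all numbers are hypothetical, and the input is assumed to be one coefficient estimate and its squared standard error from each of m imputed datasets.

```python
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Pool m per-imputation estimates with Rubin's rules.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical results for one regression coefficient from m = 5 imputed datasets.
estimates = [0.50, 0.60, 0.40, 0.55, 0.45]
variances = [0.04, 0.05, 0.04, 0.05, 0.04]
pooled, total_var = pool_estimates(estimates, variances)
print(f"pooled estimate = {pooled:.3f}, total variance = {total_var:.4f}")
```

The total variance adds the between-imputation spread to the average within-imputation variance, so uncertainty due to the missing values is reflected in the pooled standard error rather than ignored.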
Acknowledgements
The authors are grateful to the two reviewers for their helpful comments, which improved the manuscript significantly. The authors would like to thank Lisa Elon for invaluable advice and Dr. Eric Dammer for critical reading of the manuscript.
Copyright information
© 2018 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Li, L., Zhao, Y. (2018). Statistical Modeling for the Heart Disease Diagnosis via Multiple Imputation. In: Zhao, Y., Chen, DG. (eds) New Frontiers of Biostatistics and Bioinformatics. ICSA Book Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-99389-8_14
DOI: https://doi.org/10.1007/978-3-319-99389-8_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99388-1
Online ISBN: 978-3-319-99389-8
eBook Packages: Mathematics and Statistics (R0)