# A predictive modeling approach to increasing the economic effectiveness of disease management programs

## Abstract

Predictive modeling (PM) techniques are gaining importance in the health insurance business worldwide. Modern PM methods are used for customer relationship management, risk evaluation, and medical management. This article presents a PM approach that optimizes candidate selection so that the economic potential of (cost-)effective disease management programs (DMPs) can be fully exploited, as an example of successful data-driven business management. The approach is based on a generalized linear model (GLM) that health insurance companies can apply easily. Using a small portfolio from an emerging country, we show that our GLM approach is stable compared with more sophisticated regression techniques despite the difficult data environment. We also demonstrate that, in this setting, our model can compete with the expensive solutions offered by professional PM vendors and outperforms the non-predictive standard approaches to DMP selection commonly used in the market.
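To make the idea of GLM-based candidate selection concrete, the following is a minimal sketch and not the authors' model. The abstract does not state the response distribution, link function, or covariates, so everything here is an assumption chosen for illustration: a gamma family with log link, a single hypothetical covariate (`prior_hosp`, an indicator for a hospital stay in the previous year), made-up cost figures, and the hypothetical helper names `fit_gamma_glm` and `select_candidates`. Members are ranked by predicted cost and the top `k` are proposed for the DMP.

```python
import math

def fit_gamma_glm(x, y, iterations=50):
    """Fit log E[y] = b0 + b1 * x by iteratively reweighted least squares.

    Assumes a gamma family with log link. For that combination the IRLS
    weights are all equal to 1, so each step is an ordinary least-squares
    fit of the working response z on x.
    """
    n = len(y)
    b0, b1 = math.log(sum(y) / n), 0.0          # start from the overall mean
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    for _ in range(iterations):
        eta = [b0 + b1 * xi for xi in x]         # linear predictor
        # Working response: eta + (y - mu) / mu with mu = exp(eta)
        z = [e + yi / math.exp(e) - 1.0 for e, yi in zip(eta, y)]
        z_bar = sum(z) / n
        b1 = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / s_xx
        b0 = z_bar - b1 * x_bar
    return b0, b1

def select_candidates(ids, x, b0, b1, k):
    """Return the k member ids with the highest predicted cost."""
    predicted = {i: math.exp(b0 + b1 * xi) for i, xi in zip(ids, x)}
    return sorted(predicted, key=predicted.get, reverse=True)[:k]

# Illustrative training data (invented): indicator and annual claims cost.
prior_hosp = [0, 0, 0, 0, 1, 1, 1, 1]
costs = [100.0, 200.0, 150.0, 350.0, 900.0, 1500.0, 700.0, 1300.0]
b0, b1 = fit_gamma_glm(prior_hosp, costs)

# Score four hypothetical members and pick two DMP candidates.
member_ids = ["A", "B", "C", "D"]
member_hosp = [0, 1, 0, 1]
selected = select_candidates(member_ids, member_hosp, b0, b1, k=2)
```

With a single binary covariate the fitted means coincide with the group averages of the training costs, which gives a quick sanity check on the fit. A realistic model would use many covariates and a general least-squares solver in each IRLS step.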

## Keywords

- Health insurance
- Selection for disease management programs
- Predictive modeling
- Generalized linear model
- Comparison of methods

## Notes

### Acknowledgments

We would like to thank the health insurance company concerned for providing us with the claims data and the three PM vendors for participating in the test.
