Modeling and Analysis of Cost Data

Chen, Shizhe; Zhou, XH Andrew

doi:10.1007/978-1-4939-8715-3_31


Part of the book series: Health Services Research (HEALTHSR)

Abstract

Cost has become an important outcome in health services research. It can be used not only as a measure of health care spending but also as a measure of one component of health care value. Given ever-rising health care expenditure, the value of health care should reflect not only traditional measures, such as mortality and morbidity, but also the cost of care. Because resources are limited, a new treatment with slightly better efficacy but a much higher cost than an existing treatment may not be the treatment of choice for a patient. Hence, it is important to be able to analyze cost data appropriately. However, appropriate analysis of health care costs may be hindered by special distributional features of cost data, including skewness, zero values, clusters, heteroscedasticity, and multimodality.

Over the decades, various methods have been proposed to address these features. This chapter introduces methods that provide relatively trustworthy results with acceptable efficiency, covering mean inference, regression, and prediction.

Author information

Authors and Affiliations

Department of Biostatistics, University of Washington, Seattle, WA, USA
Shizhe Chen
Beijing International Center for Mathematical Research, Peking University, Beijing, China
XH Andrew Zhou
VA Puget Sound Healthcare System, University of Washington, Seattle, WA, USA
XH Andrew Zhou

Corresponding author

Correspondence to XH Andrew Zhou.

Editor information

Editors and Affiliations

Community Health and Epidemiology, Dalhousie University, Halifax, NS, Canada
Adrian Levy
ICON plc, Vancouver, BC, Canada
Sarah Goring
Department of Biostatistics, Brown University, Providence, RI, USA
Constantine Gatsonis
University of British Columbia, Vancouver, BC, Canada
Boris Sobolev
European Observatory on Health Systems and Policies, Department of Health Care Management, Berlin University of Technology, Berlin, Germany
Ewout van Ginneken
Department of Health Care Management, Faculty of Economics and Management, Technische Universität Berlin, Berlin, Germany
Reinhard Busse

About this entry

Cite this entry

Chen, S., Zhou, X.A. (2019). Modeling and Analysis of Cost Data. In: Levy, A., Goring, S., Gatsonis, C., Sobolev, B., van Ginneken, E., Busse, R. (eds) Health Services Evaluation. Health Services Research. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-8715-3_31

DOI: https://doi.org/10.1007/978-1-4939-8715-3_31
Published: 12 February 2019
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-8714-6
Online ISBN: 978-1-4939-8715-3
eBook Packages: Medicine, Reference Module Medicine
