Journal of Statistical Theory and Practice, Volume 11, Issue 4, pp 515–530

Copula regression models for discrete and mixed bivariate responses

  • Yuhui Chen
  • Timothy Hanson

Abstract

Estimation of the dependencies between bivariate discrete or mixed responses can be difficult. In this article, we propose a copula-based model with latent variables associated with discrete margins to account for correlations between bivariate discrete responses. Furthermore, we generalize this strategy for jointly modeling the dependencies between mixed responses in regression mixed models. The proposed method allows the adoption of flexible discrete margins and copula functions for various types of data. Maximum likelihood is used for model estimation; particularly, the estimation for bivariate responses in copula-based regression mixed models can be implemented using the SAS PROC NLMIXED procedure via adaptive Gaussian quadrature. In addition, a mixed model with non-Gaussian random effects can also be easily fitted using the same SAS procedure after reformulating the likelihood function by multiplying and dividing by a Gaussian density. Simulation results show good performance for bivariate discrete or mixed outcomes ranging from noncorrelated to highly correlated responses. An analysis of student performance in California schools shows a drastic improvement in estimation precision from the joint model versus two independent fits.
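As a rough illustration of how a Gaussian copula can tie two discrete margins together, the sketch below evaluates a joint probability mass function by taking rectangle differences of the copula at the marginal CDF values. It is a minimal sketch under stated assumptions, not the authors' implementation: the Poisson margins, the function names `gaussian_copula_cdf` and `joint_pmf`, and the clipping constant are all choices made here for illustration, and the article's latent-variable formulation and SAS PROC NLMIXED fitting are not reproduced.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm, poisson


def gaussian_copula_cdf(u, v, rho):
    """Bivariate Gaussian copula C(u, v; rho) with correlation rho in (-1, 1).

    Hypothetical helper for illustration. Inputs are clipped away from 0 and 1
    so that the normal quantile function stays finite.
    """
    eps = 1e-12
    z = norm.ppf(np.clip([u, v], eps, 1.0 - eps))
    cov = [[1.0, rho], [rho, 1.0]]
    return float(multivariate_normal.cdf(z, mean=[0.0, 0.0], cov=cov))


def joint_pmf(y1, y2, mu1, mu2, rho):
    """P(Y1 = y1, Y2 = y2) for two Poisson margins (an assumed choice of
    margins) linked by a Gaussian copula.

    The probability of the cell is the copula mass of the rectangle
    (F1(y1 - 1), F1(y1)] x (F2(y2 - 1), F2(y2)].
    """
    f1_hi, f1_lo = poisson.cdf(y1, mu1), poisson.cdf(y1 - 1, mu1)
    f2_hi, f2_lo = poisson.cdf(y2, mu2), poisson.cdf(y2 - 1, mu2)
    p = (
        gaussian_copula_cdf(f1_hi, f2_hi, rho)
        - gaussian_copula_cdf(f1_lo, f2_hi, rho)
        - gaussian_copula_cdf(f1_hi, f2_lo, rho)
        + gaussian_copula_cdf(f1_lo, f2_lo, rho)
    )
    # Guard against tiny negative values caused by numerical integration error.
    return max(p, 0.0)


def log_likelihood(y1, y2, mu1, mu2, rho):
    """Sum of log joint probabilities over paired observations."""
    return sum(np.log(joint_pmf(a, b, mu1, mu2, rho)) for a, b in zip(y1, y2))
```

In a regression setting the means `mu1` and `mu2` would depend on covariates through link functions, and `log_likelihood` could be passed to a general-purpose optimizer; with `rho = 0` the joint probability reduces to the product of the two marginal probabilities.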

Keywords

Gaussian copula; generalized linear model; mixed responses; bivariate dependence; random effects

AMS Subject Classification

62 

Copyright information

© Grace Scientific Publishing, 20 Middlefield Ct, Greensboro, NC 27455 2017

Authors and Affiliations

  1. Department of Mathematics, The University of Alabama, Tuscaloosa, USA
  2. Department of Statistics, University of South Carolina, Columbia, USA
