Abstract
The analysis of method comparison data is mainly concerned with evaluating agreement between methods of measuring a continuous variable. The methodology commonly assumes normally distributed data, which are usually modeled using a standard linear mixed model that assumes normality for both random effects and errors. In practice, however, the data often exhibit skewness and have tails heavier than those of a normal distribution, possibly due to outlying observations. When such data are analyzed using the standard mixed model, the non-normality may become apparent from model diagnostics. This article develops a methodology for agreement evaluation by modeling data using a recent robust mixed model that assumes a skew-t distribution for random effects and an independent t-distribution for errors. As the standard model is a special case of the robust model, the new methodology offers a unified framework for analyzing data with skewness and heavy tails as well as normally distributed data. The methodology is presented for both unreplicated and replicated data. A real example is used for illustration.
References
Arellano-Valle, R.B., H. Bolfarine, and V.H. Lachos. 2005. Skew-normal linear mixed models. Journal of Data Science 3: 415–438.
Azzalini, A. 1985. A class of distributions which includes the normal ones. Scandinavian Journal of Statistics 12: 171–178.
Azzalini, A., and A. Capitanio. 2003. Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t distribution. Journal of the Royal Statistical Society, Series B 65: 367–389.
Barnhart, H.X., and J.M. Williamson. 2001. Modeling concordance correlation via GEE to evaluate reproducibility. Biometrics 57: 931–940.
Barnhart, H.X., M.J. Haber, and L.I. Lin. 2007. An overview on assessing agreement with continuous measurement. Journal of Biopharmaceutical Statistics 17: 529–569.
Barnhart, H.X., M.J. Haber, and J. Song. 2002. Overall concordance correlation coefficient for evaluating agreement among multiple observers. Biometrics 58: 1020–1027.
Barnhart, H.X., J. Song, and M.J. Haber. 2005. Assessing intra, inter and total agreement with replicated readings. Statistics in Medicine 24: 1371–1384.
Bland, J.M., and D.G. Altman. 1999. Measuring agreement in method comparison studies. Statistical Methods in Medical Research 8: 135–160.
Carrasco, J.L., and L. Jover. 2003. Estimating the generalized concordance correlation coefficient through variance components. Biometrics 59: 849–858.
Carrasco, J.L., L. Jover, T.S. King, and V.M. Chinchilli. 2007. Comparison of concordance correlation coefficient estimating approaches with skewed data. Journal of Biopharmaceutical Statistics 17: 673–684.
Carrasco, J.L., T.S. King, and V.M. Chinchilli. 2009. The concordance correlation coefficient for repeated measures estimated by variance components. Journal of Biopharmaceutical Statistics 19: 90–105.
Carstensen, B. 2010. Comparing Clinical Measurement Methods: A Practical Guide. New York: Wiley.
Carstensen, B., J. Simpson, and L.C. Gurrin. 2008. Statistical models for assessing agreement in method comparison studies with replicate measurements. The International Journal of Biostatistics 4. doi:10.2202/1557-4679.1107
Choudhary, P.K. 2008. A tolerance interval approach for assessment of agreement in method comparison studies with repeated measurements. Journal of Statistical Planning and Inference 138: 1102–1115.
Choudhary, P.K. 2010. A unified approach for nonparametric evaluation of agreement in method comparison studies. The International Journal of Biostatistics 6. doi:10.2202/1557-4679.1235
Choudhary, P.K., and H.N. Nagaraja. 2007. Tests for assessment of agreement using probability criteria. Journal of Statistical Planning and Inference 137: 279–290.
Choudhary, P.K., and K. Yin. 2010. Bayesian and frequentist methodologies for analyzing method comparison studies with multiple methods. Statistics in Biopharmaceutical Research 2: 122–132.
Choudhary, P.K., D. Sengupta, and P. Cassey. 2014. A general skew-t mixed model that allows different degrees of freedom for random effects and error distributions. Journal of Statistical Planning and Inference 147: 235–247.
Genton, M.G. 2004. Skew-elliptical distributions and their applications: A journey beyond normality. Boca Raton: Chapman & Hall/CRC Press.
Gilbert, P., and R. Varadhan. 2012. numDeriv: Accurate numerical derivatives. R package version 2012.9-1.
Ho, H.J., and T.I. Lin. 2010. Robust linear mixed models using the skew t distribution with application to schizophrenia data. Biometrical Journal 52: 449–469.
Hothorn, T., F. Bretz, and P. Westfall. 2008. Simultaneous inference in general parametric models. Biometrical Journal 50: 346–363.
Igic, B., M.E. Hauber, J.A. Galbraith, T. Grim, D.C. Dearborn, P.L.R. Brennan, C. Moskat, P.K. Choudhary, and P. Cassey. 2010. Comparison of micrometer- and scanning electron microscope-based measurements of avian eggshell thickness. Journal of Field Ornithology 81: 402–410.
Jarek, S. 2012. mvnormtest: Normality test for multivariate variables. R package version 0.1-9.
King, T.S., and V.M. Chinchilli. 2001. A generalized concordance correlation coefficient for continuous and categorical data. Statistics in Medicine 20: 2131–2147.
King, T.S., and V.M. Chinchilli. 2001. Robust estimators of the concordance correlation coefficient. Journal of Biopharmaceutical Statistics 11: 83–105.
King, T.S., V.M. Chinchilli, and J.L. Carrasco. 2007. A repeated measures concordance correlation coefficient. Statistics in Medicine 26: 3095–3113.
King, T.S., V.M. Chinchilli, K.-L. Wang, and J.L. Carrasco. 2007. A class of repeated measures concordance correlation coefficients. Journal of Biopharmaceutical Statistics 17: 653–672.
Lachos, V.H., P. Ghosh, and R.B. Arellano-Valle. 2010. Likelihood based inference for skew-normal independent linear mixed models. Statistica Sinica 20: 303–322.
Lehmann, E.L. 1998. Elements of Large-Sample Theory. New York: Springer.
Lin, L.I. 1989. A concordance correlation coefficient to evaluate reproducibility. Biometrics 45: 255–268. Corrections: 2000, 56: 324–325.
Lin, L.I. 2000. Total deviation index for measuring individual agreement with applications in laboratory performance and bioequivalence. Statistics in Medicine 19: 255–270.
Lin, L.I., A.S. Hedayat, and W. Wu. 2007. A unified approach for assessing agreement for continuous and categorical data. Journal of Biopharmaceutical Statistics 17: 629–652.
Lin, L.I., A.S. Hedayat, and W. Wu. 2011. Statistical Tools for Measuring Agreement. New York: Springer.
McLachlan, G.J., and T. Krishnan. 2007. The EM algorithm and extensions, 2nd ed. New York: Wiley.
Meng, X.-L., and D.B. Rubin. 1993. Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika 80: 267–278.
Pinheiro, J.C., and D.M. Bates. 2000. Mixed-Effects models in S and S-PLUS. New York: Springer.
Pinheiro, J.C., C. Liu, and Y.N. Wu. 2001. Efficient algorithms for robust estimation in linear mixed-effects models using the multivariate \(t\) distribution. Journal of Computational and Graphical Statistics 10: 249–276.
R Core Team. 2014. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing.
Roy, A. 2009. An application of linear mixed effects model to assess the agreement between two methods with replicated observations. Journal of Biopharmaceutical Statistics 19: 150–173.
Smyth, G., Y. Hu, P. Dunn, B. Phipson, and Y. Chen. 2014. statmod: Statistical modeling. R package version 1.4.20.
Verbeke, G., and E. Lesaffre. 1996. A linear mixed-effects model with heterogeneity in the random-effects population. Journal of the American Statistical Association 91: 217–221.
Wang, C.M., and H.K. Iyer. 2008. Fiducial approach for assessing agreement between two instruments. Metrologia 45: 415–421.
Zhang, D., and M. Davidian. 2001. Linear mixed models with flexible distributions of random effects for longitudinal data. Biometrics 57: 795–802.
Acknowledgments
The authors thank Golo Maurer, Rebecca Boulton and Leanne Reaney for assistance in collection of the crab claws data. They are also thankful to a reviewer for comments that greatly improved this article.
Appendix
A.1 Definitions
Let \(\mathbf {Y}\) be a \(J \times 1\) random vector with \(\varvec{\mu }\) as a \(J \times 1\) location parameter vector and \(\varvec{\varSigma }\) as a \(J \times J\) scale matrix. Define
Also let \(\phi _J( \cdot | \varvec{\mu }, \varvec{\varSigma })\) be the density function of a \(\mathcal {N}_{J}(\varvec{\mu }, \varvec{\varSigma })\) distribution, and \(\tau (\cdot ,\nu )\) be the distribution function of a univariate t-distribution with \(\nu \) degrees of freedom. The J-dimensional skew-normal, t and skew-t distributions are defined as follows.
Definition 1
\(\mathbf {Y} \sim \mathcal {SN}_J (\varvec{\mu }, \varvec{\varSigma }, \varvec{\lambda })\) if its density function is
Definition 2
\(\mathbf {Y} \sim t_J (\varvec{\mu }, \varvec{\varSigma },\nu )\) if its density function is
with \(\text {gam} (\cdot )\) as the gamma function.
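For reference, the multivariate t density is usually written as below, using the \(\text {gam} (\cdot )\) notation for the gamma function. The symbol \(d\) for the squared Mahalanobis distance is introduced only for this sketch and may differ from the chapter's own notation.

```latex
f_t(\mathbf{y} \mid \varvec{\mu}, \varvec{\varSigma}, \nu)
  = \frac{\text{gam}\{(\nu + J)/2\}}
         {\text{gam}(\nu/2)\, (\nu \pi)^{J/2}\, |\varvec{\varSigma}|^{1/2}}
    \left( 1 + \frac{d}{\nu} \right)^{-(\nu + J)/2},
\qquad
d = (\mathbf{y} - \varvec{\mu})^\prime \varvec{\varSigma}^{-1} (\mathbf{y} - \varvec{\mu}).
```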
Definition 3
\( \mathbf {Y} \sim \mathcal {ST}_J (\varvec{\mu }, \varvec{\varSigma }, \varvec{\lambda }, \nu )\) if its density function is
where \(f_t (\mathbf {y} | \varvec{\mu }, \varvec{\varSigma }, \nu )\) is the density function of a \(t_J(\varvec{\mu },\varvec{\varSigma },\nu )\) distribution.
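The three definitions can be explored numerically in the univariate case \(J = 1\). The following Python sketch is not the authors' code. It assumes the commonly used forms of the densities: the skew-normal as \(2 \phi _1(y | \mu , \sigma ^2)\) times the standard normal distribution function evaluated at \(\lambda (y - \mu )/\sigma \), and the skew-t as \(2 f_t(y | \mu , \sigma ^2, \nu )\) times \(\tau (\cdot , \nu + 1)\) evaluated at \(\lambda z \{(\nu + 1)/(\nu + z^2)\}^{1/2}\) with \(z = (y - \mu )/\sigma \). All function names are hypothetical, and the t distribution function is approximated by Simpson's rule only to keep the sketch free of external libraries.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def t_pdf(y, mu, sigma2, nu):
    """Density of t_1(mu, sigma2, nu); sigma2 is a scale, not the variance."""
    d = (y - mu) ** 2 / sigma2
    log_c = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
             - 0.5 * math.log(nu * math.pi * sigma2))
    return math.exp(log_c - 0.5 * (nu + 1.0) * math.log1p(d / nu))


def t_cdf(x, nu, steps=2000):
    """tau(x, nu) for the standard t, by Simpson's rule on [0, |x|] (steps must be even)."""
    if x == 0.0:
        return 0.5
    h = abs(x) / steps
    total = t_pdf(0.0, 0.0, 1.0, nu) + t_pdf(abs(x), 0.0, 1.0, nu)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * t_pdf(k * h, 0.0, 1.0, nu)
    return 0.5 + math.copysign(total * h / 3.0, x)


def sn_pdf(y, mu, sigma2, lam):
    """Assumed SN_1(mu, sigma2, lam) density: 2 * normal density * normal cdf."""
    sigma = math.sqrt(sigma2)
    z = (y - mu) / sigma
    return 2.0 * norm_pdf(z) / sigma * norm_cdf(lam * z)


def st_pdf(y, mu, sigma2, lam, nu):
    """Assumed ST_1(mu, sigma2, lam, nu) density: 2 * t density * t cdf with nu + 1 df."""
    z = (y - mu) / math.sqrt(sigma2)
    arg = lam * z * math.sqrt((nu + 1.0) / (nu + z * z))
    return 2.0 * t_pdf(y, mu, sigma2, nu) * t_cdf(arg, nu + 1.0)


# Illustrative values only: lam = 0 removes the skewness, large nu removes the heavy tails.
for y in (-1.0, 0.0, 1.0):
    print(y, sn_pdf(y, 0.0, 1.0, 2.0), st_pdf(y, 0.0, 1.0, 2.0, 4.0))
```

Setting `lam = 0` in `st_pdf` returns the symmetric t density, and letting `nu` grow brings `st_pdf` close to `sn_pdf`, which mirrors the nesting of the standard model within the robust one described in the abstract.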
A.2 Hierarchical Representation for a GST Mixed Model
Let \(\mathbf {Y}\) be an M-vector obtained by dropping the subscript i in \(\mathbf {Y}_i\) defined by (1). From (3), the GST mixed model for \(\mathbf {Y}\) can be written as
where \(\mathbf {b}\) and \(\mathbf {e}\) are mutually independent. For a hierarchical representation of this model, define for \(v > 0\),
and let \(\mathcal {G} (\alpha , \beta )\) denote a gamma distribution with parameters \(\alpha , \beta > 0\), and density
Now from [18], the model (A.1) can be represented as
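A hierarchy of this kind lends itself to simulation. The Python sketch below is an illustration under stated assumptions, not the representation given in [18]: it takes the simplest case of a scalar random intercept \(b\) shared by two methods, builds the skew-t random effect as a skew-normal variate divided by the square root of a gamma variate, and builds the t errors as normal variates whose variance is divided by a second, independent gamma variate. The gamma distribution is taken in the shape and rate form with both parameters equal to half the degrees of freedom. Function names and all parameter values are hypothetical.

```python
import math
import random


def draw_skew_t(psi, lam, nu, rng):
    """One draw from ST_1(0, psi, lam, nu) as (skew-normal) / sqrt(gamma)."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    # random.gammavariate takes (shape, scale); rate nu/2 corresponds to scale 2/nu.
    u = rng.gammavariate(nu / 2.0, 2.0 / nu)
    half_normal = abs(rng.gauss(0.0, 1.0))
    normal = rng.gauss(0.0, 1.0)
    skew_normal = math.sqrt(psi) * (
        delta * half_normal + math.sqrt(1.0 - delta * delta) * normal
    )
    return skew_normal / math.sqrt(u)


def draw_subject(beta, psi, lam, nu_b, sigma2, nu_e, rng):
    """One subject's measurements Y_j = beta_j + b + e_j for each method j.

    b is skew-t with nu_b degrees of freedom; the errors share one mixing
    variable v, so that jointly they follow a t distribution with nu_e
    degrees of freedom and diagonal scale matrix diag(sigma2).
    """
    b = draw_skew_t(psi, lam, nu_b, rng)
    v = rng.gammavariate(nu_e / 2.0, 2.0 / nu_e)
    return [
        beta_j + b + rng.gauss(0.0, math.sqrt(s2 / v))
        for beta_j, s2 in zip(beta, sigma2)
    ]


# Illustrative values only: two methods, five subjects.
rng = random.Random(2015)
for _ in range(5):
    y = draw_subject([10.0, 10.5], 1.0, 3.0, 5.0, [0.25, 0.40], 5.0, rng)
    print([round(value, 3) for value in y])
```

Because the random effect and the errors use separate mixing variables, their degrees of freedom can differ, which is the feature emphasized in the title of [18].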
A.3 Linear Combination of Skew-Normals
Proposition 1
Let \(\mathbf {Y} \sim \mathcal {SN}_q (\varvec{\beta }, \varvec{\varPsi }, \varvec{\lambda })\) and consider the quantities defined in (4). Let \(\mathbf {a} \in \mathbb {R}^q\) with at least one non-zero element. Then
Proof
The proof relies on a stochastic representation of a skew-normal variate. Let \(\mathbf {Y}^* \sim \mathcal {SN}_q (\mathbf {0}, \varvec{\varPsi }, \varvec{\lambda })\). Then, from [1],
where \(G_1^* \sim \mathcal {N}_1 (0, 1)\), \(\mathbf {G}_2^* \sim \mathcal {N}_q (\mathbf {0}, \mathbf {I}_q)\) independently of \(G_1^*\), and the notation “\(\overset{d}{=}\)” means “equal in distribution.” Using (A.5), we can write
where \(G^* \sim \mathcal {N}_1 (0, 1)\) independently of \(G_1^*\). Define
From an application of (4), we have \( (\mathbf {a}^\prime \varvec{\varPsi }^{1/2} \varvec{\delta })^2 + \mathbf {a}^\prime \varvec{\varGamma } \mathbf {a} = \mathbf {a}^\prime \varvec{\varPsi } \mathbf {a}\), implying
This allows us to write
Now the result follows from the representation (A.5) for the univariate case.
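The proposition can be checked by simulation for \(q = 2\). The Python sketch below rests on assumptions that are not confirmed by the text: it takes the quantities in (4) to be \(\varvec{\delta } = \varvec{\lambda }/(1 + \varvec{\lambda }^\prime \varvec{\lambda })^{1/2}\) and \(\varvec{\varGamma } = \varvec{\varPsi } - \varvec{\varPsi }^{1/2} \varvec{\delta } \varvec{\delta }^\prime \varvec{\varPsi }^{1/2}\), which is consistent with the identity used in the proof, and it takes the conclusion to be \(\mathbf {a}^\prime \mathbf {Y} \sim \mathcal {SN}_1 (\mathbf {a}^\prime \varvec{\beta }, \mathbf {a}^\prime \varvec{\varPsi } \mathbf {a}, \lambda _a)\) with \(\lambda _a = \mathbf {a}^\prime \varvec{\varPsi }^{1/2} \varvec{\delta } / (\mathbf {a}^\prime \varvec{\varGamma } \mathbf {a})^{1/2}\). It draws \(\mathbf {Y}\) from a half-normal plus normal representation and compares the sample mean, variance and skewness of \(\mathbf {a}^\prime \mathbf {Y}\) with those of the assumed univariate skew-normal. All names and parameter values are hypothetical.

```python
import math
import random


def sqrt_2x2(m):
    """Symmetric square root of a 2x2 symmetric positive definite matrix."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    s = math.sqrt(a * d - b * b)
    t = math.sqrt(a + d + 2.0 * s)
    return [[(a + s) / t, b / t], [b / t, (d + s) / t]]


def skew_parts(psi, lam):
    """Assumed form of (4): h = Psi^{1/2} delta and Gamma = Psi - h h'."""
    scale = math.sqrt(1.0 + lam[0] ** 2 + lam[1] ** 2)
    delta = [lam[0] / scale, lam[1] / scale]
    root = sqrt_2x2(psi)
    h = [root[i][0] * delta[0] + root[i][1] * delta[1] for i in range(2)]
    gamma = [[psi[i][j] - h[i] * h[j] for j in range(2)] for i in range(2)]
    return h, gamma


def quad(a, m):
    """Quadratic form a' m a."""
    return sum(a[i] * m[i][j] * a[j] for i in range(2) for j in range(2))


def combo_parameters(a, beta, psi, lam):
    """Location, scale and skewness parameter of a'Y under the assumed result."""
    h, gamma = skew_parts(psi, lam)
    a_h = a[0] * h[0] + a[1] * h[1]
    location = a[0] * beta[0] + a[1] * beta[1]
    return location, quad(a, psi), a_h / math.sqrt(quad(a, gamma))


def sn1_moments(mu, sigma2, lam):
    """Mean, variance and skewness of SN_1(mu, sigma2, lam)."""
    m = lam / math.sqrt(1.0 + lam * lam) * math.sqrt(2.0 / math.pi)
    skew = 0.5 * (4.0 - math.pi) * m ** 3 / (1.0 - m * m) ** 1.5
    return mu + math.sqrt(sigma2) * m, sigma2 * (1.0 - m * m), skew


def simulate_combo(a, beta, psi, lam, n, seed):
    """Draw a'Y with Y = beta + h |G1| + L G2, where L L' = Gamma (Cholesky)."""
    h, gamma = skew_parts(psi, lam)
    l11 = math.sqrt(gamma[0][0])
    l21 = gamma[1][0] / l11
    l22 = math.sqrt(gamma[1][1] - l21 * l21)
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        g1 = abs(rng.gauss(0.0, 1.0))
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y1 = beta[0] + h[0] * g1 + l11 * z1
        y2 = beta[1] + h[1] * g1 + l21 * z1 + l22 * z2
        draws.append(a[0] * y1 + a[1] * y2)
    return draws


def sample_moments(x):
    """Sample mean, variance and skewness."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    skew = sum((v - mean) ** 3 for v in x) / n / var ** 1.5
    return mean, var, skew


# Illustrative values only; a = (1, -1) gives the difference of two components.
a, beta = [1.0, -1.0], [1.0, 2.0]
psi, lam = [[2.0, 0.6], [0.6, 1.0]], [3.0, -1.0]
print("assumed SN_1 moments:", sn1_moments(*combo_parameters(a, beta, psi, lam)))
print("simulated moments:   ", sample_moments(simulate_combo(a, beta, psi, lam, 20000, seed=11)))
```

Agreement of the first three moments is a sanity check rather than a proof, but a clear mismatch would indicate that one of the assumed forms differs from the chapter's definitions.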
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Sengupta, D., Choudhary, P.K., Cassey, P. (2015). Modeling and Analysis of Method Comparison Data with Skewness and Heavy Tails. In: Choudhary, P., Nagaraja, C., Ng, H. (eds) Ordered Data Analysis, Modeling and Health Research Methods. Springer Proceedings in Mathematics & Statistics, vol 149. Springer, Cham. https://doi.org/10.1007/978-3-319-25433-3_11
DOI: https://doi.org/10.1007/978-3-319-25433-3_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25431-9
Online ISBN: 978-3-319-25433-3
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)