Measuring Agreement in Method Comparison Studies — A Review

Choudhary, Pankaj K.; Nagaraja, H. N.

doi:10.1007/0-8176-4422-9_13

Part of the book series: Statistics for Industry and Technology (SIT)

Abstract

Assessment of agreement between two or more methods of measurement is of considerable importance in many areas. In medicine in particular, new methods or devices that are cheaper, easier to use, or less invasive are routinely developed. Agreement between a new method and a traditional reference or gold standard must be evaluated before the new one is put into practice. Various methodologies have been proposed for this purpose in recent years. We review the literature, focusing on the assessment of agreement between two methods and on the selection of the best method when several are compared with a reference. A real data set is analyzed to illustrate the various approaches.

Author information

Authors and Affiliations

Department of Mathematical Sciences, University of Texas at Dallas, Richardson, TX, USA
Pankaj K. Choudhary
Department of Statistics, The Ohio State University, Columbus, OH, USA
H. N. Nagaraja

Editor information

Editors and Affiliations

Department of Mathematics and Statistics, McMaster University, Hamilton, Ontario, L8S 4K1, Canada
N. Balakrishnan
Department of Statistics, Ohio State University, 1958 Neil Avenue Cockins Hall, Room 404, Columbus, OH, 43210-1247, USA
H. N. Nagaraja
Department of Management Science and Statistics, University of Texas at San Antonio, 6900 N. Loop 1604 W, San Antonio, TX, 78249-0632, USA
N. Kannan

Cite this chapter

Choudhary, P.K., Nagaraja, H.N. (2005). Measuring Agreement in Method Comparison Studies — A Review. In: Balakrishnan, N., Nagaraja, H.N., Kannan, N. (eds) Advances in Ranking and Selection, Multiple Comparisons, and Reliability. Statistics for Industry and Technology. Birkhäuser Boston. https://doi.org/10.1007/0-8176-4422-9_13

DOI: https://doi.org/10.1007/0-8176-4422-9_13
Publisher Name: Birkhäuser Boston
Print ISBN: 978-0-8176-3232-8
Online ISBN: 978-0-8176-4422-2
eBook Packages: Mathematics and Statistics (R0)
