Standards and Guidelines for Validation Practices: Development and Evaluation of Measurement Instruments

Part of the book series: Social Indicators Research Series (SINS, volume 54)

Abstract

The objectives of this chapter are to provide an overview of standards and guidelines for validation practices in developing and evaluating measurement instruments, and to examine the extent to which these standards and guidelines are in line with contemporary theories of validity. The chapter reviews the AERA, APA, and NCME Standards for Educational and Psychological Testing; the Food and Drug Administration (FDA) guidance for industry (Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims); the Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN) checklist; the Society for Industrial and Organizational Psychology (SIOP) Principles for the Validation and Use of Personnel Selection Procedures; and the European Federation of Psychologists’ Associations (EFPA) test evaluation model. These standards and guidelines cover different sources of validity, and they do not appear to reflect the issues, foci, and theoretical orientations seen in contemporary views of validity (e.g., Kane, Messick, Zumbo).

Notes

  1. The International Test Commission (ITC 2001) has guidelines on test use. Although the guidelines, as stated in the document, have implications for the development of measurement instruments, the focus is on test user competencies (e.g., knowledge, skills, abilities, and related characteristics). The ITC guidelines are therefore not included in this review.

  2. The European Medicines Agency (EMA 2005) published a document providing broad recommendations on the use of health-related quality of life (HRQoL) measures, a specific type of patient-reported outcome (PRO), in its medical product evaluation process. The EMA explicitly states that the document is a reflection paper, not guidance. Therefore, the EMA document is not included in the present review.

References

  • American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1985). Standards for educational and psychological testing. Washington, DC: American Psychological Association.

  • American Educational Research Association (AERA), American Psychological Association (APA), & National Council on Measurement in Education (NCME). (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

  • American Psychological Association, Committee on Test Standards. (1952). Technical recommendations for psychological tests and diagnostic techniques: A preliminary proposal. American Psychologist, 7, 461–465.

  • American Psychological Association. (1954). Technical recommendations for psychological tests and diagnostic techniques. Psychological Bulletin, 51, 201–238.

  • American Psychological Association. (2002a). Criteria for practice guideline development and evaluation. American Psychologist, 57, 1048–1051.

  • American Psychological Association. (2002b). Criteria for evaluating treatment guidelines. American Psychologist, 57, 1052–1059.

  • American Psychological Association, American Educational Research Association, & National Council on Measurement in Education. (1966). Standards for educational and psychological tests and manuals. Washington, DC: American Psychological Association.

  • American Psychological Association, American Educational Research Association, & National Council on Measurement in Education. (1974). Standards for educational and psychological tests. Washington, DC: American Psychological Association.

  • Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111, 1061–1071.

  • Carlson, J. F., & Geisinger, K. F. (2012). Test reviewing at the Buros Center for Testing. International Journal of Testing, 12, 122–135.

  • Cronbach, L. J. (1988). Five perspectives on validation argument. In H. Wainer & H. Braun (Eds.), Test validity (pp. 3–17). Hillsdale: Lawrence Erlbaum Associates.

  • DeMuro, C., Clark, M., Mordin, M., Fehnel, S., Copley-Merriman, C., & Gnanasakthy, A. (2012). Reasons for rejection of patient-reported outcome label claims: A compilation based on a review of patient-reported outcome use among new molecular entities and biologic license applications, 2006–2010. Value in Health, 15, 443–448.

  • Eccles, M. P., Grimshaw, J. M., Shekelle, P., Schünemann, H. J., & Woolf, S. (2012). Developing clinical practice guidelines: Target audiences, identifying topics for guidelines, guideline group composition and functioning and conflicts of interest. Implementation Science, 7, 60.

  • European Medicines Agency, Committee for Medicinal Products for Human Use. (2005). Reflection paper on the regulatory guidance for the use of Health-Related Quality of Life [HRQL] measures in the evaluation of medicinal products. London: Author.

  • Evers, A., Muñiz, J., Hagemeister, C., Høstmælingen, A., Lindley, P., Sjöberg, A., & Bartram, D. (2013). Assessing the quality of tests: Revision of the EFPA review model. Psicothema, 25, 283–291.

  • Food and Drug Administration. (2009). Guidance for industry: Patient-reported outcome measures: Use in medical product development to support labeling claims. Rockville: Department of Health and Human Services, Food and Drug Administration, Center for Drug Evaluation and Research.

  • Hubley, A. M., & Zumbo, B. D. (1996). A dialectic on validity: Where we have been and where we are going. The Journal of General Psychology, 123, 207–215.

  • Hubley, A. M., & Zumbo, B. D. (2011). Validity and the consequences of test interpretation and use. Social Indicators Research, 103, 219–230.

  • Hubley, A. M., & Zumbo, B. D. (2013). Psychometric characteristics of assessment procedures: An overview. In K. F. Geisinger (Ed.), APA handbook of testing and assessment in psychology (Vol. 1, pp. 3–19). Washington, DC: American Psychological Association Press.

  • International Test Commission. (2001). International guidelines for test use. International Journal of Testing, 1, 93–114.

  • Kane, M. T. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). Westport: American Council on Education/Praeger.

  • Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50, 1–73.

  • Markus, K. A., & Borsboom, D. (2013). Frontiers of test validity theory: Measurement, causation, and meaning. New York: Routledge.

  • Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). New York: American Council on Education/Macmillan.

  • Mokkink, L. B., Terwee, C. B., Gibbons, E., Stratford, P. W., Alonso, J., Patrick, D. L., Knol, D. L., Bouter, L. M., & de Vet, H. C. W. (2010a). Inter-rater agreement and reliability of the COSMIN (COnsensus-Based Standards for the Selection of Health Measurement Instruments) checklist. BMC Medical Research Methodology, 10, 82.

  • Mokkink, L. B., Terwee, C. B., Patrick, D. L., Alonso, J., Stratford, P. W., Knol, D. L., Bouter, L. M., & de Vet, H. C. W. (2010b). The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: An international Delphi study. Quality of Life Research, 19, 539–549.

  • Mokkink, L. B., Terwee, C. B., Patrick, D. L., Alonso, J., Stratford, P. W., Knol, D. L., Bouter, L. M., & de Vet, H. C. W. (2010c). The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. Journal of Clinical Epidemiology, 63, 737–745.

  • Scientific Advisory Committee of the Medical Outcomes Trust. (2002). Assessing health status and quality-of-life instruments: Attributes and review criteria. Quality of Life Research, 11, 193–205.

  • Shekelle, P. G., Woolf, S. H., Eccles, M., & Grimshaw, J. (1999). Clinical guidelines: Developing guidelines. British Medical Journal, 318, 593–596.

  • Society for Industrial and Organizational Psychology. (2003). Principles for the validation and use of personnel selection procedures (4th ed.). Bowling Green: Author.

  • The AGREE Collaboration. (2003). Development and validation of an international appraisal instrument for assessing the quality of clinical practice guidelines: The AGREE project. Quality and Safety in Health Care, 12, 18–23.

  • Valderas, J. M., Ferrer, M., Mendívil, J., et al. (2008). Development of EMPRO: A tool for the standardized assessment of patient-reported outcome measures. Value in Health, 11, 700–708.

  • Woolf, S. H., Grol, R., Hutchinson, A., Eccles, M., & Grimshaw, J. (1999). Clinical guidelines: Potential benefits, limitations, and harms of clinical guidelines. British Medical Journal, 318, 527–530.

  • Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa: Directorate of Human Resources Research and Evaluation, Department of National Defense.

  • Zumbo, B. D. (2007). Validity: Foundational issues and statistical methodology. In C. R. Rao & S. Sinharay (Eds.), Psychometrics (Handbook of statistics, Vol. 26, pp. 45–79). Amsterdam: Elsevier.

  • Zumbo, B. D. (2009). Validity as contextualized and pragmatic explanation, and its implications for validation practice. In R. W. Lissitz (Ed.), The concept of validity: Revisions, new directions and applications (pp. 65–82). Charlotte: IAP – Information Age Publishing.

Acknowledgement

I thank Professor Bruno Zumbo for comments and suggestions.

Author information

Corresponding author

Correspondence to Eric K. H. Chan.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Chan, E.K.H. (2014). Standards and Guidelines for Validation Practices: Development and Evaluation of Measurement Instruments. In: Zumbo, B., Chan, E. (eds) Validity and Validation in Social, Behavioral, and Health Sciences. Social Indicators Research Series, vol 54. Springer, Cham. https://doi.org/10.1007/978-3-319-07794-9_2
