Abstract
Purpose
We sought to empirically assess the effect of predictor method characteristics (test form, item-type, and test-type) on retest score change associated with an invariant construct—general mental ability (GMA)—and to evaluate the effect of retesting on the criterion-related validity of assessments that vary in their susceptibility to retest effects.
Design
Three hundred seven individuals completed a battery of GMA assessments. After a 6-week interval, participants returned to the testing site to retest using both alternate and identical forms of the initial assessments.
Findings
Greater score gains were observed on assessments comprising heterogeneous item-types than on those comprising homogeneous item-types, and on performance-based assessments than on self-report assessments. However, despite these variations in score gains, for all assessments the relationships between initial test scores and criterion scores did not differ from the relationships between retest scores and criterion scores.
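Comparing initial-test validity with retest validity amounts to comparing two dependent correlations that share a common criterion, the situation addressed by the test of Meng, Rosenthal, and Rubin (1992, cited in the references). A minimal sketch in Python, with all sample values hypothetical rather than taken from the study:

```python
import math

def meng_z(r1, r2, rx, n):
    """Meng, Rosenthal, & Rubin (1992) z-test for two dependent
    correlations that share one variable (here, the criterion).

    r1, r2 -- correlations of initial and retest scores with the criterion
    rx     -- correlation between the initial and retest scores
    n      -- sample size
    """
    z1 = math.atanh(r1)            # Fisher r-to-z transform
    z2 = math.atanh(r2)
    rbar2 = (r1**2 + r2**2) / 2    # mean squared correlation
    f = min((1 - rx) / (2 * (1 - rbar2)), 1.0)
    h = (1 - f * rbar2) / (1 - rbar2)
    return (z1 - z2) * math.sqrt((n - 3) / (2 * (1 - rx) * h))

# Hypothetical values: validities of .45 (initial) and .42 (retest),
# test-retest correlation .80, n = 307 as in the study's sample.
z = meng_z(0.45, 0.42, 0.80, 307)  # |z| < 1.96: no significant difference
```

With correlations of this magnitude the resulting z falls well short of conventional significance thresholds, which is the pattern of result the Findings section describes.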
Implications
Tests and procedures that reduce reliance on test- or item-specific knowledge and skill may help minimize score changes due to retesting across multiple administrations. Moreover, under the boundary conditions present in this study, the criterion-related validity of ability assessments may not be affected by increases in test-specific knowledge and skills.
Originality/Value
Despite the prevalence and industry support of retesting, a comprehensive understanding of retest score change still eludes researchers and practitioners. This ambiguity may be due in part to neglect of the method-construct distinction in the retest literature. This is the first report to explicitly utilize the method-construct distinction in an effort to examine the causes and consequences of retest effects.
Notes
The Wonderlic PT User’s Manual states that retesting should always be conducted using an alternate form of the test (Wonderlic, Inc. 2002). Allowing test-takers to retest using the identical form of the Wonderlic PT is contrary to the user’s manual; we did so to test our specific hypotheses.
References
Ackerman, P. L. (1994). Intelligence, attention, and learning: Maximal and typical performance. In D. K. Detterman (Ed.), Current topics in human intelligence: Vol. 4. Theories of intelligence (pp. 2–27). Norwood, NJ: Ablex.
Ackerman, P. L., & Wolman, S. D. (2007). Determinants and validity of self-estimates of abilities and self-concept measures. Journal of Experimental Psychology: Applied, 13, 57–78.
Allalouf, A., & Ben-Shakhar, G. (1998). The effect of coaching on the predictive validity of scholastic aptitude tests. Journal of Educational Measurement, 35, 31–47.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
Anastasi, A. (1981). Coaching, test sophistication, and developed abilities. American Psychologist, 36, 1086–1093.
Arthur, W., Jr., Glaze, R. M., Villado, A. J., & Taylor, J. E. (2009). Unproctored internet-based tests of cognitive ability and personality: Magnitude and extent of cheating and response distortion. Industrial and Organizational Psychology: Perspectives on Science and Practice, 2, 39–45.
Arthur, W., Jr., Glaze, R. M., Villado, A. J., & Taylor, J. E. (2010). The magnitude and extent of cheating and response distortion effects on unproctored internet-based tests of cognitive ability and personality. International Journal of Selection and Assessment, 18, 1–16.
Arthur, W., Jr., & Villado, A. J. (2008). The importance of distinguishing between constructs and methods when comparing predictors in personnel selection research and practice. Journal of Applied Psychology, 93, 435–442.
Austin, E. J., Deary, I. J., Gibson, G. J., McGregor, M. J., & Dent, J. B. (1998). Individual response spread in self-report scales: Personality correlations and consequences. Personality and Individual Differences, 24, 421–438.
Birkeland, S. A., Manson, T. M., Kisamore, J. L., Brannick, M. T., & Smith, M. A. (2006). A meta-analytic investigation of job applicant faking on personality measures. International Journal of Selection and Assessment, 14, 317–335.
Bors, D. A., & Stokes, T. L. (1998). Raven’s Advanced Progressive Matrices: Norms for first-year university students and the development of a short form. Educational and Psychological Measurement, 58, 382–398.
Brown, R. P., & Day, E. A. (2006). The difference isn’t black and white: Stereotype threat and the race gap on Raven’s Advanced Progressive Matrices. Journal of Applied Psychology, 91, 979–985.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. New York: Cambridge University Press.
Cattell, R. B. (1943). The measurement of adult intelligence. Psychological Bulletin, 40, 153–193.
Cattell, R. B. (1987). Intelligence: Its structure, growth, and action. Amsterdam: Elsevier.
Cronbach, L. J. (1946). Response sets and test validity. Educational and Psychological Measurement, 6, 485–494.
Cronbach, L. J. (1949). Essentials of psychological testing. New York: Harper.
Cronbach, L. J. (1950). Further evidence on response sets and test design. Educational and Psychological Measurement, 10, 3–31.
Deary, I. J., Whiteman, M. C., Starr, J. M., Whalley, L. J., & Fox, H. C. (2004). The impact of childhood intelligence on later life: Following up the Scottish Mental Survey of 1932 and 1947. Journal of Personality and Social Psychology, 86, 130–147.
Downie, J. (1994). Characteristics of MCAT examinees: 1992–1993. Washington, DC: Association of American Medical Colleges.
Dunlap, W. P., Cortina, J. M., Vaslow, J. B., & Burke, M. J. (1996). Meta-analysis of experiments with matched groups of repeated measures designs. Psychological Methods, 1, 170–177. doi:10.1037/1082-989X.1.2.170.
Ellingson, J. E., Heggestad, E. D., & Makarius, E. E. (2012). Personality retesting for managing intentional distortion. Journal of Personality and Social Psychology, 102, 1063–1076.
Flowers, K. (1996). Characteristics of MCAT examinees: 1994–1995. Washington, DC: Association of American Medical Colleges.
Freund, P. A., & Kasten, N. (2012). How smart do you think you are? A meta-analysis on the validity of self-estimates of cognitive ability. Psychological Bulletin, 138, 296–321.
Hausknecht, J. P. (2010). Candidate persistence and personality test practice effects: Implications for staffing system management. Personnel Psychology, 63, 299–324.
Hausknecht, J. P., Halpert, J. A., Di Paolo, N. T., & Moriarty Gerrard, M. O. (2007). Retesting in selection: A meta-analysis of coaching and practice effects for tests of cognitive ability. Journal of Applied Psychology, 92, 373–385.
Hausknecht, J. P., Trevor, C. O., & Farr, J. L. (2002). Retaking ability tests in a selection setting: Implications for practice effects, training performance, and turnover. Journal of Applied Psychology, 87, 243–254.
Hogan, J., Barrett, P., & Hogan, R. (2007). Personality measurement, faking, and employment selection. Journal of Applied Psychology, 92, 1270–1285.
Huffcutt, A. I., Conway, J. M., Roth, P. L., & Stone, N. J. (2001). Identification and meta-analytic assessment of psychological constructs measured in employment interviews. Journal of Applied Psychology, 86, 897–913.
Jensen, A. R. (1998). The g factor: The science of mental ability. Westport, CT: Praeger.
Kulik, J. A., Bangert-Drowns, R. L., & Kulik, C. C. (1984a). Effectiveness of coaching for aptitude tests. Psychological Bulletin, 95, 179–188.
Kulik, J. A., Kulik, C. C., & Bangert, R. L. (1984b). Effects of practice on aptitude and achievement test scores. American Educational Research Journal, 21, 435–447.
Leger, K. F. (1997). Characteristics of MCAT examinees: 1996. Washington, DC: Association of American Medical Colleges.
Lievens, F., Buyse, T., & Sackett, P. R. (2005). Retest effects in operational selection settings: Development and test of a framework. Personnel Psychology, 58, 981–1007.
Lievens, F., Reeve, C. L., & Heggestad, E. D. (2007). An examination of psychometric bias due to retesting on cognitive ability tests in selection settings. Journal of Applied Psychology, 92, 1672–1682.
Mabe, P. A., & West, S. G. (1982). Validity of self-evaluation of ability: A review and meta-analysis. Journal of Applied Psychology, 67, 280–296.
Mangos, P. M., Thissen-Roe, A., & Robinson, R. (2012, April). Modeling retest trajectories: Trait, scoring, and practice effects. Paper presented at the Society for Industrial and Organizational Psychology, San Diego, CA.
Meng, X., Rosenthal, R., & Rubin, D. B. (1992). Comparing correlated correlation coefficients. Psychological Bulletin, 111, 172–175.
Mrazek, M. D., Smallwood, J., Franklin, M. S., Chin, J. M., Baird, B., & Schooler, J. W. (2012). The role of mind-wandering in measurements of general aptitude. Journal of Experimental Psychology: General, 141, 788–798.
Nathan, J. S., & Camara, W. J. (1998). Score change when retaking the SAT I: Reasoning Test (Research Note No. RN-05). New York: The College Board.
Neisser, U., Boodoo, G., Bouchard, T. J., Jr., Boykin, A. W., Brody, N., Ceci, S. J., et al. (1996). Intelligence: Knowns and unknowns. American Psychologist, 51, 77–101.
Paulhus, D. L., Lysy, D. C., & Yik, M. S. M. (1998). Self-report measures of intelligence: Are they useful as proxy IQ tests? Journal of Personality, 66, 525–554.
Powers, D. E. (1986). Relations of test item characteristics to test preparation/test practice effects: A quantitative summary. Psychological Bulletin, 100, 67–77.
Powers, D. E., Fowles, M. E., & Farnum, M. (1993). Prepublishing the topics for a test of writing skills: A small-scale simulation. Applied Measurement in Education, 6, 119–135.
Raven, J. C., Raven, J., & Court, J. H. (1991). Manual for Raven’s progressive matrices and vocabulary scales (Sect. 1). Oxford: Oxford Psychologists Press.
Reeve, C. L., & Lam, H. (2005). The psychometric paradox of practice effects due to retesting: Measurement invariance and stable ability estimates in the face of observed score changes. Intelligence, 33, 535–549. doi:10.1016/j.intell.2005.05.003.
Reeve, C. L., & Lam, H. (2007). The relation between practice effects, test-taker characteristics and degree of g-saturation. International Journal of Testing, 7, 225–242.
Roediger, H. L., III, & Karpicke, J. D. (2006). The power of testing memory: Basic research and implications for educational practice. Perspectives on Psychological Science, 1, 181–210.
Schleicher, D. J., Van Iddekinge, C. H., Morgeson, F. P., & Campion, M. A. (2010). If at first you don’t succeed, try, try again: Understanding race, age, and gender differences in retesting score improvement. Journal of Applied Psychology, 95, 603–617.
Society for Industrial and Organizational Psychology. (2003). Principles for the validation and use of personnel selection procedures (4th ed.). Bowling Green, OH: Author.
Swann, W. B., Jr., Griffin, J., Predmore, S., & Gaines, B. (1987). The cognitive-affective crossfire: When self-consistency confronts self-enhancement. Journal of Personality and Social Psychology, 52, 881–889.
te Nijenhuis, J., van Vianen, A. E., & van der Flier, H. (2007). Score gains on g-loaded tests: No g. Intelligence, 35, 283–300.
Tippins, N. T., Beaty, J., Drasgow, F., Gibson, W. M., Pearlman, K., Segall, D. O., & Shepherd, W. (2006). Unproctored internet testing in employment settings. Personnel Psychology, 59, 189–225.
Tuzinski, K. A., Laczo, R. M., & Sackett, P. R. (2005, April). Impact of response distortion on retaking of cognitive and personality tests. Paper presented at the Society for Industrial and Organizational Psychology, Los Angeles, CA.
Van Iddekinge, C. H., Morgeson, F. P., Schleicher, D. J., & Campion, M. A. (2011). Can I retake it? Exploring subgroup differences and criterion-related validity in promotion retesting. Journal of Applied Psychology, 96, 941–955.
Wonderlic, Inc. (2002). Wonderlic personnel test & scholastic level exam user’s manual. Vernon Hills, IL: Wonderlic Inc.
Acknowledgments
The authors would like to thank Rebecca Ray and Ray Laughter, as well as the Lone Star College System and Houston Community College System for their assistance with data collection.
Additional information
Received and reviewed by former editor, George Neuman.
Cite this article
Villado, A.J., Randall, J.G. & Zimmer, C.U. The Effect of Method Characteristics on Retest Score Gains and Criterion-Related Validity. J Bus Psychol 31, 233–248 (2016). https://doi.org/10.1007/s10869-015-9408-7