Abstract
Purpose
We sought to empirically assess the effect of predictor method characteristics (test form, item-type, and test-type) on retest score change associated with an invariant construct—general mental ability (GMA)—and to evaluate the effect of retesting on the criterion-related validity of assessments that vary in their susceptibility to retest effects.
Design
Three hundred seven individuals completed a battery of GMA assessments. After a 6-week interval, participants returned to the testing site to retest using both alternate and identical forms of the initial assessments.
Findings
Greater score gains were observed on assessments comprising heterogeneous item-types than on those comprising homogeneous item-types, and on performance-based assessments than on self-report assessments. However, despite these variations in score gains, for all assessments the relationships between initial test scores and criterion scores did not differ from the relationships between retest scores and criterion scores.
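Comparing initial-test validity with retest validity amounts to comparing two dependent correlations that share a common criterion, the situation addressed by the test of Meng, Rosenthal, and Rubin (1992, cited in the references). A minimal sketch in Python, with all sample values hypothetical rather than taken from the study:

```python
import math

def meng_z(r1, r2, rx, n):
    """Meng, Rosenthal, & Rubin (1992) z-test for two dependent
    correlations that share one variable (here, the criterion).

    r1, r2 -- correlations of initial and retest scores with the criterion
    rx     -- correlation between the initial and retest scores
    n      -- sample size
    """
    z1 = math.atanh(r1)            # Fisher r-to-z transform
    z2 = math.atanh(r2)
    rbar2 = (r1**2 + r2**2) / 2    # mean squared correlation
    f = min((1 - rx) / (2 * (1 - rbar2)), 1.0)
    h = (1 - f * rbar2) / (1 - rbar2)
    return (z1 - z2) * math.sqrt((n - 3) / (2 * (1 - rx) * h))

# Hypothetical values: validities of .45 (initial) and .42 (retest),
# test-retest correlation .80, n = 307 as in the study's sample.
z = meng_z(0.45, 0.42, 0.80, 307)  # |z| < 1.96: no significant difference
```

With correlations of this magnitude the resulting z falls well short of conventional significance thresholds, which is the pattern of result the Findings section describes.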
Implications
Tests and procedures that reduce reliance on test- or item-specific knowledge and skill may help minimize score changes due to retesting across multiple administrations. Moreover, under the boundary conditions present in this study, the criterion-related validity of ability assessments may not be affected by increases in test-specific knowledge and skills.
Originality/Value
Despite the prevalence and industry support of retesting, a comprehensive understanding of retest score change still eludes researchers and practitioners. This ambiguity may be due in part to neglect of the method-construct distinction in the retest literature. This is the first report to explicitly utilize the method-construct distinction in an effort to examine the causes and consequences of retest effects.
Notes
The Wonderlic PT User’s Manual states that retesting should always be conducted using an alternate form of the test (Wonderlic, Inc. 2002). Allowing test-takers to retest using the identical form of the Wonderlic PT is contrary to the user’s manual; we did so to test our specific hypotheses.
References
Ackerman, P. L. (1994). Intelligence, attention, and learning: Maximal and typical performance. In D. K. Detterman (Ed.), Current topics in human intelligence: Vol. 4. Theories of intelligence (pp. 2–27). Norwood, NJ: Ablex.
Ackerman, P. L., & Wolman, S. D. (2007). Determinants and validity of self-estimates of abilities and self-concept measures. Journal of Experimental Psychology: Applied, 13, 57–78.
Allalouf, A., & Ben-Shakhar, G. (1998). The effect of coaching on the predictive validity of scholastic aptitude tests. Journal of Educational Measurement, 35, 31–47.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
Anastasi, A. (1981). Coaching, test sophistication, and developed abilities. American Psychologist, 36, 1086–1093.
Arthur, W., Jr., Glaze, R. M., Villado, A. J., & Taylor, J. E. (2009). Unproctored internet-based tests of cognitive ability and personality: Magnitude and extent of cheating and response distortion. Industrial and Organizational Psychology: Perspectives on Science and Practice, 2, 39–45.
Arthur, W., Jr., Glaze, R. M., Villado, A. J., & Taylor, J. E. (2010). The magnitude and extent of cheating and response distortion effects on unproctored internet-based tests of cognitive ability and personality. International Journal of Selection and Assessment, 18, 1–16.
Arthur, W., Jr., & Villado, A. J. (2008). The importance of distinguishing between constructs and methods when comparing predictors in personnel selection research and practice. Journal of Applied Psychology, 93, 435–442.
Austin, E. J., Deary, I. J., Gibson, G. J., McGregor, M. J., & Dent, J. B. (1998). Individual response spread in self-report scales: Personality correlations and consequences. Personality and Individual Differences, 24, 421–438.
Birkeland, S. A., Manson, T. M., Kisamore, J. L., Brannick, M. T., & Smith, M. A. (2006). A meta-analytic investigation of job applicant faking on personality measures. International Journal of Selection and Assessment, 14, 317–335.
Bors, D. A., & Stokes, T. L. (1998). Raven’s Advanced Progressive Matrices: Norms for first-year university students and the development of a short form. Educational and Psychological Measurement, 58, 382–398.
Brown, R. P., & Day, E. A. (2006). The difference isn’t black and white: Stereotype threat and the race gap on Raven’s Advanced Progressive Matrices. Journal of Applied Psychology, 91, 979–985.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. New York: Cambridge University Press.
Cattell, R. B. (1943). The measurement of adult intelligence. Psychological Bulletin, 40, 153–193.
Cattell, R. B. (1987). Intelligence: Its structure, growth, and action. Amsterdam: Elsevier.
Cronbach, L. J. (1946). Response sets and test validity. Educational and Psychological Measurement, 6, 485–494.
Cronbach, L. J. (1949). Essentials of psychological testing. New York: Harper.
Cronbach, L. J. (1950). Further evidence on response sets and test design. Educational and Psychological Measurement, 10, 3–31.
Deary, I. J., Whiteman, M. C., Starr, J. M., Whalley, L. J., & Fox, H. C. (2004). The impact of childhood intelligence on later life: Following up the Scottish Mental Survey of 1932 and 1947. Journal of Personality and Social Psychology, 86, 130–147.
Downie, J. (1994). Characteristics of MCAT examinees: 1992–1993. Washington, DC: Association of American Medical Colleges.
Dunlap, W. P., Cortina, J. M., Vaslow, J. B., & Burke, M. J. (1996). Meta-analysis of experiments with matched groups of repeated measures designs. Psychological Methods, 1, 170–177. doi:10.1037/1082-989X.1.2.170.
Ellingson, J. E., Heggestad, E. D., & Makarius, E. E. (2012). Personality retesting for managing intentional distortion. Journal of Personality and Social Psychology, 102, 1063–1076.
Flowers, K. (1996). Characteristics of MCAT examinees: 1994–1995. Washington, DC: Association of American Medical Colleges.
Freund, P. A., & Kasten, N. (2012). How smart do you think you are? A meta-analysis on the validity of self-estimates of cognitive ability. Psychological Bulletin, 138, 296–321.
Hausknecht, J. P. (2010). Candidate persistence and personality test practice effects: Implications for staffing system management. Personnel Psychology, 63, 299–324.
Hausknecht, J. P., Halpert, J. A., Di Paolo, N. T., & Moriarty Gerrard, M. O. (2007). Retesting in selection: A meta-analysis of coaching and practice effects for tests of cognitive ability. Journal of Applied Psychology, 92, 373–385.
Hausknecht, J. P., Trevor, C. O., & Farr, J. L. (2002). Retaking ability tests in a selection setting: Implications for practice effects, training performance, and turnover. Journal of Applied Psychology, 87, 243–254.
Hogan, J., Barrett, P., & Hogan, R. (2007). Personality measurement, faking, and employment selection. Journal of Applied Psychology, 92, 1270–1285.
Huffcutt, A. I., Conway, J. M., Roth, P. L., & Stone, N. J. (2001). Identification and meta-analytic assessment of psychological constructs measured in employment interviews. Journal of Applied Psychology, 86, 897–913.
Jensen, A. R. (1998). The g factor: The science of mental ability. Westport, CT: Praeger.
Kulik, J. A., Bangert-Drowns, R. L., & Kulik, C. C. (1984a). Effectiveness of coaching for aptitude tests. Psychological Bulletin, 95, 179–188.
Kulik, J. A., Kulik, C. C., & Bangert, R. L. (1984b). Effects of practice on aptitude and achievement test scores. American Educational Research Journal, 21, 435–447.
Leger, K. F. (1997). Characteristics of MCAT examinees: 1996. Washington, DC: Association of American Medical Colleges.
Lievens, F., Buyse, T., & Sackett, P. R. (2005). Retest effects in operational selection settings: Development and test of a framework. Personnel Psychology, 58, 981–1007.
Lievens, F., Reeve, C. L., & Heggestad, E. D. (2007). An examination of psychometric bias due to retesting on cognitive ability tests in selection settings. Journal of Applied Psychology, 92, 1672–1682.
Mabe, P. A., & West, S. G. (1982). Validity of self-evaluation of ability: A review and meta-analysis. Journal of Applied Psychology, 67, 280–296.
Mangos, P. M., Thissen-Roe, A., & Robinson, R. (2012, April). Modeling retest trajectories: Trait, scoring, and practice effects. Paper presented at the Society for Industrial and Organizational Psychology, San Diego, CA.
Meng, X., Rosenthal, R., & Rubin, D. B. (1992). Comparing correlated correlation coefficients. Psychological Bulletin, 111, 172–175.
Mrazek, M. D., Smallwood, J., Franklin, M. S., Chin, J. M., Baird, B., & Schooler, J. W. (2012). The role of mind-wandering in measurements of general aptitude. Journal of Experimental Psychology: General, 141, 788–798.
Nathan, J. S., & Camara, W. J. (1998). Score change when retaking the SAT I: Reasoning Test (Research Note No. RN-05). New York: The College Board.
Neisser, U., Boodoo, G., Bouchard, T. J., Jr., Boykin, A. W., Brody, N., Ceci, S. J., et al. (1996). Intelligence: Knowns and unknowns. American Psychologist, 51, 77–101.
Paulhus, D. L., Lysy, D. C., & Yik, M. S. M. (1998). Self-report measures of intelligence: Are they useful as proxy IQ tests? Journal of Personality, 66, 525–554.
Powers, D. E. (1986). Relations of test item characteristics to test preparation/test practice effects: A quantitative summary. Psychological Bulletin, 100, 67–77.
Powers, D. E., Fowles, M. E., & Farnum, M. (1993). Prepublishing the topics for a test of writing skills: A small-scale simulation. Applied Measurement in Education, 6, 119–135.
Raven, J. C., Raven, J., & Court, J. H. (1991). Manual for Raven’s progressive matrices and vocabulary scales (Sect. 1). Oxford: Oxford Psychologists Press.
Reeve, C. L., & Lam, H. (2005). The psychometric paradox of practice effects due to retesting: Measurement invariance and stable ability estimates in the face of observed score changes. Intelligence, 33, 535–549. doi:10.1016/j.intell.2005.05.003.
Reeve, C. L., & Lam, H. (2007). The relation between practice effects, test-taker characteristics and degree of g-saturation. International Journal of Testing, 7, 225–242.
Roediger, H. L., III, & Karpicke, J. D. (2006). The power of testing memory: Basic research and implications for educational practice. Perspectives on Psychological Science, 1, 181–210.
Schleicher, D. J., Van Iddekinge, C. H., Morgeson, F. P., & Campion, M. A. (2010). If at first you don’t succeed, try, try again: Understanding race, age, and gender differences in retesting score improvement. Journal of Applied Psychology, 95, 603–617.
Society for Industrial and Organizational Psychology. (2003). Principles for the validation and use of personnel selection procedures (4th ed.). Bowling Green, OH: Author.
Swann, W. B., Jr., Griffin, J., Predmore, S., & Gaines, B. (1987). The cognitive-affective crossfire: When self-consistency confronts self-enhancement. Journal of Personality and Social Psychology, 52, 881–889.
te Nijenhuis, J., van Vianen, A. E., & van der Flier, H. (2007). Score gains on g-loaded tests: No g. Intelligence, 35, 283–300.
Tippins, N. T., Beaty, J., Drasgow, F., Gibson, W. M., Pearlman, K., Segall, D. O., & Shepherd, W. (2006). Unproctored internet testing in employment settings. Personnel Psychology, 59, 189–225.
Tuzinski, K. A., Laczo, R. M., & Sackett, P. R. (2005, April). Impact of response distortion on retaking of cognitive and personality tests. Paper presented at the Society for Industrial and Organizational Psychology, Los Angeles, CA.
Van Iddekinge, C. H., Morgeson, F. P., Schleicher, D. J., & Campion, M. A. (2011). Can I retake it? Exploring subgroup differences and criterion-related validity in promotion retesting. Journal of Applied Psychology, 96, 941–955.
Wonderlic, Inc. (2002). Wonderlic personnel test & scholastic level exam user’s manual. Vernon Hills, IL: Wonderlic Inc.
Acknowledgments
The authors would like to thank Rebecca Ray and Ray Laughter, as well as the Lone Star College System and Houston Community College System for their assistance with data collection.
Additional information
Received and reviewed by former editor, George Neuman.
Cite this article
Villado, A.J., Randall, J.G. & Zimmer, C.U. The Effect of Method Characteristics on Retest Score Gains and Criterion-Related Validity. J Bus Psychol 31, 233–248 (2016). https://doi.org/10.1007/s10869-015-9408-7