Abstract
Psychological measurement using the paired-comparison format, which is considered robust to systematic response bias, has attracted much scholarly attention. The Thurstonian D-diffusion item response theory model was recently proposed to incorporate response-time information in this context. Because reliability is a fundamental measurement property, this study used that model to conduct a preliminary investigation of how much reliability increases when response-time information is incorporated into paired-comparison psychological measurement. Under some realistic conditions, our simulation revealed a practically relevant (but not very large) increase. The same type of increase was also found in our analysis of a real psychological dataset containing measurements of the Big Five traits. This study thus provides evidence supporting the collection and use of response time in paired-comparison psychological measurement.
Funding
This work was supported by JSPS KAKENHI Grant numbers 17H04787, 17J07674, and 19H00616.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Kyosuke Bunji is now at Benesse Educational Research and Development Institute, Tokyo, Japan.
Communicated by Kentaro Kato.
About this article
Cite this article
Okada, K., Bunji, K. Increase of reliability by incorporating response time into the paired-comparison psychological measurement. Behaviormetrika 48, 169–177 (2021). https://doi.org/10.1007/s41237-020-00109-5
Keywords
- Response time
- Reliability
- Paired-comparison
- Thurstonian D-diffusion IRT