Restricted Recalibration of Item Response Theory Models

Liu, Yang; Yang, Ji Seung; Maydeu-Olivares, Alberto

doi:10.1007/s11336-019-09667-4

Published: 20 March 2019

Psychometrika, Volume 84, pages 529–553 (2019)

Yang Liu¹, Ji Seung Yang¹ & Alberto Maydeu-Olivares²,³


Abstract

In item response theory (IRT), it is often necessary to perform restricted recalibration (RR) of the model: A set of (focal) parameters is estimated holding a set of (nuisance) parameters fixed. Typical applications of RR include expanding an existing item bank, linking multiple test forms, and associating constructs measured by separately calibrated tests. In the current work, we provide full statistical theory for RR of IRT models under the framework of pseudo-maximum likelihood estimation. We describe the standard error calculation for the focal parameters, the assessment of overall goodness-of-fit (GOF) of the model, and the identification of misfitting items. We report a simulation study to evaluate the performance of these methods in the scenario of adding a new item to an existing test. Parameter recovery for the focal parameters as well as Type I error and power of the proposed tests are examined. An empirical example is also included, in which we validate the pediatric fatigue short-form scale in the Patient-Reported Outcome Measurement Information System (PROMIS), compute global and local GOF statistics, and update parameters for the misfitting items.

Notes

We assume that the response patterns have been sorted in an arbitrary but fixed order.
When the items are ordinal, the statistic defined by Eq. 20 is different from the original $M_2$ statistic proposed by Maydeu-Olivares and Joe (2006). It is the same as the $M_{ord}$ statistic in Maydeu-Olivares and Joe (2014) and the $M_2^*$ statistic in Cai and Hansen (2013).
Because the nuisance parameters $\varvec{\xi }$ were estimated by ML from the previous data $\mathbf{Y}'$, ${\varvec{\Omega }}_{\varvec{\xi }}$ amounts to the inverse Fisher information matrix with respect to the intercept and slope parameters of the first 9 items.

References

Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B (Methodological), 57, 289–300.
Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 395–479). Reading, MA: Addison-Wesley.
Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46(4), 443–459.
Bock, R. D., & Lieberman, M. (1970). Fitting a response model for $n$ dichotomously scored items. Psychometrika, 35(2), 179–197.
Bock, R. D., & Zimowski, M. F. (1997). Multiple group IRT. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 433–448). New York: Springer.
Box, G. E. P. (1954). Some theorems on quadratic forms applied in the study of analysis of variance problems: I. Effect of inequality of variance in the one-way classification. The Annals of Mathematical Statistics, 25(2), 290–302.
Boyd, S., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press.
Bradlow, E. T., Wainer, H., & Wang, X. (1999). A Bayesian random effects model for testlets. Psychometrika, 64(2), 153–168.
Breithaupt, K., Ariel, A. A., & Hare, D. R. (2010). Assembling an inventory of multistage adaptive testing systems. In W. J. van der Linden & C. A. Glas (Eds.), Elements of adaptive testing (pp. 247–266). New York, NY: Springer.
Browne, M. W. (2000). Cross-validation methods. Journal of Mathematical Psychology, 44(1), 108–132.
Cai, L., & Hansen, M. (2013). Limited-information goodness-of-fit testing of hierarchical item factor models. British Journal of Mathematical and Statistical Psychology, 66(2), 245–276.
Cai, L., Maydeu-Olivares, A., Coffman, D. L., & Thissen, D. (2006). Limited-information goodness-of-fit testing of item response theory models for sparse $2^{p}$ tables. British Journal of Mathematical and Statistical Psychology, 59(1), 173–194.
Cheng, Y., & Yuan, K.-H. (2010). The impact of fallible item parameter estimates on latent trait recovery. Psychometrika, 75, 280–291.
Cochran, W. G. (1952). The ${\chi }^{2}$ test of goodness of fit. The Annals of Mathematical Statistics, 23(3), 315–345.
Cressie, N., & Read, T. R. (1984). Multinomial goodness-of-fit tests. Journal of the Royal Statistical Society, Series B (Methodological), 46(3), 440–464.
Curran, P. J., & Hussong, A. M. (2009). Integrative data analysis: The simultaneous analysis of multiple data sets. Psychological Methods, 14(2), 81–100.
Drasgow, F., Levine, M. V., Tsien, S., Williams, B., & Mead, A. D. (1995). Fitting polytomous item response theory models to multiple-choice tests. Applied Psychological Measurement, 19(2), 143–166.
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Erlbaum.
Fox, J.-P. (2005). Multilevel IRT using dichotomous and polytomous response data. British Journal of Mathematical and Statistical Psychology, 58(1), 145–172.
Glas, C. A. (1988). The derivation of some tests for the Rasch model from the multinomial distribution. Psychometrika, 53(4), 525–546.
Glas, C. A. (1999). Modification indices for the 2-PL and the nominal response model. Psychometrika, 64(3), 273–294.
Glas, C. A., & Suárez Falcón, J. C. (2003). A comparison of item-fit statistics for the three-parameter logistic model. Applied Psychological Measurement, 27(2), 87–106. https://doi.org/10.1177/0146621602250530.
Gong, G., & Samaniego, F. J. (1981). Pseudo maximum likelihood estimation: Theory and applications. The Annals of Statistics, 9(4), 861–869.
Gunsjö, A. (1994). Faktoranalys av ordinala variabler. Stockholm: Acta Universitatis Upsaliensis.
Haberman, S. J. (2006). Adaptive quadrature for item response models. ETS Research Report Series, 2006(2), 1–10.
Haberman, S. J., & Sinharay, S. (2013). Generalized residuals for general models for contingency tables with application to item response theory. Journal of the American Statistical Association, 108(504), 1435–1444.
Haberman, S. J., Sinharay, S., & Chon, K. H. (2013). Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions. Psychometrika, 78(3), 417–440.
Haley, S. M., Ni, P., Jette, A. M., Tao, W., Moed, R., Meyers, D., et al. (2009). Replenishing a computerized adaptive test of patient-reported daily activity functioning. Quality of Life Research, 18(4), 461–471.
Hofer, S. M., & Piccinin, A. M. (2009). Integrative data analysis through coordination of measurement and analysis protocol across independent longitudinal studies. Psychological Methods, 14(2), 150–164.
Joe, H., & Maydeu-Olivares, A. (2006). On the asymptotic distribution of Pearson’s $X^2$ in cross-validation samples. Psychometrika, 71(3), 587–592.
Joe, H., & Maydeu-Olivares, A. (2010). A general family of limited information goodness-of-fit statistics for multinomial data. Psychometrika, 75(3), 393–419.
Jöreskog, K. G., & Moustaki, I. (2001). Factor analysis of ordinal variables: A comparison of three approaches. Multivariate Behavioral Research, 36(3), 347–387.
Kim, S. (2006). A comparative study of IRT fixed parameter calibration methods. Journal of Educational Measurement, 43(4), 355–381.
Lai, J.-S., Stucky, B. D., Thissen, D., Varni, J. W., DeWitt, E. M., Irwin, D. E., et al. (2013). Development and psychometric properties of the PROMIS® pediatric fatigue item banks. Quality of Life Research, 22(9), 2417–2427. https://doi.org/10.1007/s11136-013-0357-1.
Liu, Y., & Maydeu-Olivares, A. (2014). Identifying the source of misfit in item response theory models. Multivariate Behavioral Research, 49(4), 354–371.
Liu, Y., & Thissen, D. (2012). Identifying local dependence with a score test statistic based on the bifactor logistic model. Applied Psychological Measurement, 36(8), 670–688.
Liu, Y., & Thissen, D. (2014). Comparing score tests and other local dependence diagnostics for the graded response model. British Journal of Mathematical and Statistical Psychology, 67(3), 496–513.
Liu, Y., & Yang, J. S. (2017). Interval estimation of latent variable scores in item response theory. Journal of Educational and Behavioral Statistics. https://doi.org/10.3102/1076998617732764.
Liu, Y., & Yang, J. S. (2018). Bootstrap-calibrated interval estimates for latent variable scores in item response theory. Psychometrika, 83(2), 333–354.
Luecht, R. M. (2006). Operational issues in computer-based testing. In D. Bartram & R. Hambleton (Eds.), Computer-based testing and the internet: Issues and advances (pp. 91–114). New York: Wiley.
Magnus, J., & Neudecker, H. (1999). Matrix differential calculus with applications in statistics and econometrics. New York: Wiley.
Maydeu-Olivares, A., & Joe, H. (2005). Limited- and full-information estimation and goodness-of-fit testing in $2^{n}$ contingency tables: A unified framework. Journal of the American Statistical Association, 100(471), 1009–1020.
Maydeu-Olivares, A., & Joe, H. (2006). Limited information goodness-of-fit testing in multidimensional contingency tables. Psychometrika, 71(4), 713–732.
Maydeu-Olivares, A., & Joe, H. (2008). An overview of limited information goodness-of-fit testing in multidimensional contingency tables. In K. Shigemasu, A. Okada, T. Imaizumi, & T. Hoshino (Eds.), New trends in psychometrics (pp. 253–262). Tokyo: Universal Academy Press.
Maydeu-Olivares, A., & Joe, H. (2014). Assessing approximate fit in categorical data analysis. Multivariate Behavioral Research, 49(4), 305–328.
Maydeu-Olivares, A., & Liu, Y. (2015). Item diagnostics in multivariate discrete data. Psychological Methods, 20(2), 276–292.
McDonald, R. P. (1981). The dimensionality of tests and items. British Journal of Mathematical and Statistical Psychology, 34(1), 100–117.
Meng, X.-L., & Wong, W. H. (1996). Simulating ratios of normalizing constants via a simple identity: A theoretical exploration. Statistica Sinica, 6(4), 831–860.
Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58(4), 525–543.
Mosier, C. I. (1951). Symposium: The need and means of cross-validation. I. Problems and designs of cross-validation. Educational and Psychological Measurement, 11(1), 5–11.
Muthén, B. (1978). Contributions to factor analysis of dichotomous variables. Psychometrika, 43(4), 551–560.
Muthén, B. (1983). Latent variable structural equation modeling with categorical data. Journal of Econometrics, 22(1–2), 43–65.
Muthén, B. (1984). A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika, 49(1), 115–132.
Muthén, B. (1993). Goodness of fit with categorical and other nonnormal variables. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 205–234). Newbury Park, CA: Sage.
Muthén, L. K., & Muthén, B. O. (1998–2017). Mplus user’s guide [Computer software manual]. Los Angeles, CA.
Parke, W. R. (1986). Pseudo maximum likelihood estimation: The asymptotic distribution. The Annals of Statistics, 14(1), 355–357.
R Core Team. (2018). R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria. Retrieved from https://www.R-project.org/
Ranger, J., & Kuhn, J.-T. (2012). Assessing fit of item response models using the information matrix test. Journal of Educational Measurement, 49(3), 247–268.
Rao, C. R. (1973). Linear statistical inference and its applications. New York: Wiley.
Read, T. R. (1984). Closer asymptotic approximations for the distributions of the power divergence goodness-of-fit statistics. Annals of the Institute of Statistical Mathematics, 36(1), 59–69.
Reiser, M. (1996). Analysis of residuals for the multinomial item response model. Psychometrika, 61(3), 509–528.
Rubin, D. B. (1984). Bayesianly justifiable and relevant frequency calculations for the applied statistician. The Annals of Statistics, 12(4), 1151–1172.
Rupp, A. A. (2013). A systematic review of the methodology for person fit research in item response theory: Lessons about generalizability of inferences from the design of simulation studies. Psychological Test and Assessment Modeling, 55(1), 3–38.
Rupp, A. A., & Zumbo, B. D. (2006). Understanding parameter invariance in unidimensional IRT models. Educational and Psychological Measurement, 66(1), 63–84.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika monograph No. 17. Richmond, VA: Psychometric Society.
Schilling, S., & Bock, R. D. (2005). High-dimensional maximum marginal likelihood item factor analysis by adaptive quadrature. Psychometrika, 70(3), 533–555.
Thissen, D., Liu, Y., Magnus, B., & Quinn, H. (2015). Extending the use of multidimensional IRT calibration as projection: Many-to-one linking and linear computation of projected scores. In Quantitative psychology research (pp. 1–16). Springer.
Thissen, D., & Steinberg, L. (2009). Item response theory. In R. Millsap & A. Maydeu-Olivares (Eds.), The Sage handbook of quantitative methods in psychology (pp. 148–177). London: Sage Publications.
Thissen, D., Steinberg, L., & Kuang, D. (2002). Quick and easy implementation of the Benjamini-Hochberg procedure for controlling the false positive rate in multiple comparisons. Journal of Educational and Behavioral Statistics, 27(1), 77–83.
Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 67–113). Hillsdale, NJ: Lawrence Erlbaum Associates.
Thissen, D., Varni, J. W., Stucky, B. D., Liu, Y., Irwin, D. E., & DeWalt, D. A. (2011). Using the PedsQL™ 3.0 asthma module to obtain scores comparable with those of the PROMIS pediatric asthma impact scale (PAIS). Quality of Life Research, 20(9), 1497–1505.
van der Vaart, A. W. (2000). Asymptotic statistics. New York: Cambridge University Press.
Venables, W. N., & Ripley, B. D. (2002). Modern applied statistics with S (4th ed.). New York: Springer. (ISBN 0-387-95457-0).
von Davier, M., & von Davier, A. A. (2007). A unified approach to IRT scale linking and scale transformations. Methodology, 3(3), 115–124.
Wollack, J. A., Cohen, A. S., & Wells, C. S. (2003). A method for maintaining scale stability in the presence of test speededness. Journal of Educational Measurement, 40(4), 307–330.
Yang, J. S., Hansen, M., & Cai, L. (2012). Characterizing sources of uncertainty in item response theory scale scores. Educational and Psychological Measurement, 72(2), 264–290.
Zhao, Y., & Joe, H. (2005). Composite likelihood estimation in multivariate data analysis. Canadian Journal of Statistics, 33(3), 335–356.

Author information

Authors and Affiliations

Department of Human Development and Quantitative Methodology, University of Maryland, College Park, USA
Yang Liu & Ji Seung Yang
Department of Psychology, University of South Carolina, Columbia, USA
Alberto Maydeu-Olivares
Department of Psychology, University of Barcelona, Barcelona, Spain
Alberto Maydeu-Olivares

Corresponding author

Correspondence to Yang Liu.

Additional information

The authors would like to thank Dr. David Thissen from the Department of Psychology at the University of North Carolina at Chapel Hill for his feedback and suggestions about this work. The participation of Ji Seung Yang was supported by the National Science Foundation under Grant EHR-1534846. The participation of Alberto Maydeu-Olivares was supported by the National Science Foundation under Grant SES-1659936.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 218 KB)

Supplementary material 2 (dat 19 KB)

Supplementary material 3 (out 13 KB)

Supplementary material 4 (R 15 KB)

Supplementary material 5 (out 14 KB)

Appendix A The Asymptotic Distribution of Residuals

A.1 The Pseudo-Maximum Likelihood Estimator

Under the setup of RR and Conditions A1–A6 of Gong and Samaniego (1981), the following expansion of the $n$-sample likelihood equation (Eq. 4) holds for $(\hat{\varvec{\xi }}, \hat{\varvec{\eta }}){}^\top $ in some neighborhood of $(\underline{\varvec{\xi }}, \underline{\varvec{\eta }}){}^\top $:

$$\begin{aligned} \mathbf{0}= & {} \frac{1}{n}\sum _{i=1}^{n}\hat{\mathbf{g}}(\mathbf{y}_i) = \frac{1}{n}\sum _{i=1}^{n}\underline{\mathbf{g}}(\mathbf{y}_i) + \left[ \frac{1}{n}\sum _{i=1}^{n}\underline{\mathbf{H}}_{\varvec{\eta \xi }}(\mathbf{y}_i)\right] (\hat{\varvec{\xi }} - \underline{\varvec{\xi }}) \nonumber \\&+\left[ \frac{1}{n}\sum _{i=1}^{n}\underline{\mathbf{H}}_{ \varvec{\eta }}(\mathbf{y}_i)\right] (\hat{\varvec{\eta }} - \underline{\varvec{\eta }}) + \mathbf{R}_n, \end{aligned}$$

(24)

in which $\mathbf{g}(\mathbf{y}_i; {\varvec{\xi }}, {\varvec{\eta }}) = \partial \log \pi (\mathbf{y}_{i};{\varvec{\xi }}, {\varvec{\eta }})/\partial {\varvec{\eta }}$, $\mathbf{H}_{\varvec{\eta }}(\mathbf{y}_i; {\varvec{\xi }}, {\varvec{\eta }}) = \partial ^2\log \pi (\mathbf{y}_{i};{\varvec{\xi }}, {\varvec{\eta }})/\partial {\varvec{\eta }}\partial {\varvec{\eta }}{}^\top $, $\mathbf{H}_{\varvec{\eta \xi }}(\mathbf{y}_i; {\varvec{\xi }}, {\varvec{\eta }}) = \partial ^2\log \pi (\mathbf{y}_{i};{\varvec{\xi }}, {\varvec{\eta }})/\partial {\varvec{\eta }}\partial {\varvec{\xi }}{}^\top $, and $\mathbf{R}_n$ denotes the remainder term. As usual, the hat symbol and underline indicate evaluations at the pseudo-ML estimates and the true values of parameters, respectively. If $\frac{1}{n}\sum _{i=1}^{n}\underline{\mathbf{H}}_{ \varvec{\eta }}(\mathbf{y}_i)$ is invertible, then Eq. 24 can be rewritten as

$$\begin{aligned} \sqrt{n}(\hat{\varvec{\eta }} - {\underline{\varvec{\eta }}})= & {} \left[ -\frac{1}{n}\sum _{i=1}^{n} {\underline{\mathbf {H}}}_{\varvec{\eta }}({\mathbf {y}}_i)\right] ^{-1} \left\{ \frac{1}{\sqrt{n}}\sum _{i=1}^{n}{\underline{\mathbf {g}}} (\mathbf {y}_i) \right. \nonumber \\&\quad \left. + \sqrt{\frac{n}{n'}}\left[ \frac{1}{n}\sum _{i=1}^{n}\underline{\mathbf {H}}_{ \varvec{\eta \xi }}(\mathbf {y}_i)\right] \sqrt{n'}(\hat{\varvec{\xi }} - {\underline{\varvec{\xi }}}) + \sqrt{n}{\mathbf {R}}_n\right\} . \end{aligned}$$

(25)

The assumed regularity conditions guarantee that

$$\begin{aligned}&-\frac{1}{n}\sum _{i=1}^{n}{\underline{\mathbf {H}}}_{\varvec{\eta }} (\mathbf {y}_i){}{\mathop {\rightarrow }\limits ^{p}} \underline{\varvec{\mathcal {I}}}_{\varvec{\eta }},\ -\frac{1}{n}\sum _{i=1}^{n}\underline{\mathbf {H}}_{\varvec{\eta \xi }} (\mathbf {y}_i){}{\mathop {\rightarrow }\limits ^{p}} \underline{\varvec{\mathcal {I}}}_{\varvec{\eta \xi }},\ \frac{n}{n'}\rightarrow {\underline{c}},\ {\sqrt{n}} \mathbf {R}_n{}{\mathop {\rightarrow }\limits ^{p}}\mathbf {0},\nonumber \\&\frac{1}{\sqrt{n}}\sum _{i=1}^{n}\underline{\mathbf {g}}(\mathbf {y}_i){} {\mathop {\rightarrow }\limits ^{d}}\mathcal {N}(\mathbf {0}, \underline{\varvec{\mathcal {I}}}_{\varvec{\eta }}),\ \text{ and } \sqrt{n'}(\hat{\varvec{\xi }} - \underline{\varvec{\xi }}){} {\mathop {\rightarrow }\limits ^{d}}\mathcal {N}(\mathbf {0}, \underline{\varvec{\Omega }}_{\varvec{\xi }}). \end{aligned}$$

(26)

Equation 6 is established by combining Eqs. 25 and 26 and applying Slutsky’s lemma.
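
As a numerical illustration of how Eqs. 25 and 26 combine, the following sketch computes the limiting covariance of $\sqrt{n}(\hat{\varvec{\eta }} - \underline{\varvec{\eta }})$ that those two displays imply. It assumes that $\hat{\varvec{\xi }}$ is independent of the current sample, so that the two random terms inside the braces of Eq. 25 are independent and their covariances add, giving $\underline{\varvec{\mathcal {I}}}_{\varvec{\eta }}^{-1}(\underline{\varvec{\mathcal {I}}}_{\varvec{\eta }} + \underline{c}\,\underline{\varvec{\mathcal {I}}}_{\varvec{\eta \xi }}\underline{\varvec{\Omega }}_{\varvec{\xi }}\underline{\varvec{\mathcal {I}}}{}^\top _{\varvec{\eta \xi }})\underline{\varvec{\mathcal {I}}}_{\varvec{\eta }}^{-1}$. Eq. 6 itself is not reproduced in this excerpt, so this is one reading of the argument rather than a quotation of the result; the function and argument names are hypothetical.

```python
import numpy as np


def pseudo_ml_covariance(info_eta, info_eta_xi, omega_xi, c):
    """Limiting covariance of sqrt(n) * (eta_hat - eta_true) implied by Eqs. 25-26.

    info_eta    : (q, q) information matrix for the focal parameters eta
    info_eta_xi : (q, r) cross-information between eta and xi
    omega_xi    : (r, r) limiting covariance of sqrt(n') * (xi_hat - xi_true)
    c           : limit of n / n'
    Assumption (not stated in this excerpt): xi_hat is independent of the
    current sample, so the score term and the xi term are uncorrelated.
    """
    info_eta = np.asarray(info_eta, dtype=float)
    info_eta_xi = np.asarray(info_eta_xi, dtype=float)
    omega_xi = np.asarray(omega_xi, dtype=float)
    # Covariance of the term in braces in Eq. 25: the score contribution plus
    # the uncertainty carried over from the previously estimated xi.
    middle = info_eta + c * info_eta_xi @ omega_xi @ info_eta_xi.T
    inv_info = np.linalg.inv(info_eta)
    return inv_info @ middle @ inv_info.T


def standard_errors(cov, n):
    """Approximate standard errors of eta_hat for a current sample of size n."""
    return np.sqrt(np.diag(np.asarray(cov, dtype=float)) / n)
```

When `c` is zero (the previous calibration sample is much larger than the current one), the result reduces to the inverse information matrix, which is the usual ML covariance that ignores uncertainty in $\hat{\varvec{\xi }}$.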

A.2 The Asymptotic Covariance Matrix of Residuals

It is straightforward to show that $\mathbf{p} - \underline{\varvec{\pi }}$, $\hat{\varvec{\xi }} - \underline{\varvec{\xi }}$, and $\hat{\varvec{\eta }} - \underline{\varvec{\eta }}$ are jointly asymptotically normal when the model is correctly specified:

$$\begin{aligned} \sqrt{n}\begin{pmatrix} \mathbf{p} - \underline{\varvec{\pi }}\\ \hat{\varvec{\xi }} - \underline{\varvec{\xi }}\\ \hat{\varvec{\eta }} - \underline{\varvec{\eta }} \end{pmatrix} {\mathop {\rightarrow }\limits ^{d}} \mathcal{N}\left( \mathbf{0}, \begin{pmatrix} \underline{\varvec{\Gamma }} & \underline{\varvec{\Theta }}{}^\top _{21} & \underline{\varvec{\Theta }}{}^\top _{31}\\ \underline{\varvec{\Theta }}_{21} & \underline{c}\,\underline{\varvec{\Omega }}_{\varvec{\xi }} & \underline{\varvec{\Omega }}{}^\top _{\varvec{\eta \xi }}\\ \underline{\varvec{\Theta }}_{31} & \underline{\varvec{\Omega }}_{\varvec{\eta \xi }} & \underline{\varvec{\Omega }}_{\varvec{\eta }} \end{pmatrix} \right) , \end{aligned}$$

(27)

in which $\underline{\varvec{\Theta }}_{21} = \mathbf {0}$ because $\hat{\varvec{\xi }}$, which is computed from the previous data $\mathbf {Y}'$ alone, is independent of the current sample proportions $\mathbf {p}$, and

$$\begin{aligned} {\underline{\varvec{\Theta }}_{31}} = {\underline{\varvec{\mathcal {I}}}}_{\varvec{\eta }}^{-1}{\underline{\varvec{\Delta }}}{}^\top _{\varvec{\eta }}{\underline{\mathbf {D}}}^{-1}\text{ Cov }(\sqrt{n}{} \mathbf {p}) = {\underline{\varvec{\mathcal {I}}}}_{\varvec{\eta }}^{-1}{\underline{\varvec{\Delta }}}{}^\top _{\varvec{\eta }}. \end{aligned}$$

(28)
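
The second equality in Eq. 28 can be checked in one line under the usual multinomial notation, which this excerpt does not restate and which is therefore assumed here: $\underline{\mathbf {D}} = \text{diag}(\underline{\varvec{\pi }})$, $\text{Cov}(\sqrt{n}\mathbf {p}) = \underline{\varvec{\Gamma }} = \underline{\mathbf {D}} - \underline{\varvec{\pi }}\underline{\varvec{\pi }}{}^\top $, and $\underline{\varvec{\Delta }}_{\varvec{\eta }} = \partial \varvec{\pi }/\partial \varvec{\eta }{}^\top $ evaluated at the true parameter values. Since $\underline{\mathbf {D}}^{-1}\underline{\varvec{\pi }} = \mathbf {1}$ and the cell probabilities sum to one for every parameter value, so that $\underline{\varvec{\Delta }}{}^\top _{\varvec{\eta }}\mathbf {1} = \mathbf {0}$, it follows that

$$\begin{aligned} \underline{\varvec{\Delta }}{}^\top _{\varvec{\eta }}\underline{\mathbf {D}}^{-1}\left( \underline{\mathbf {D}} - \underline{\varvec{\pi }}\underline{\varvec{\pi }}{}^\top \right) = \underline{\varvec{\Delta }}{}^\top _{\varvec{\eta }} - \underline{\varvec{\Delta }}{}^\top _{\varvec{\eta }}\mathbf {1}\underline{\varvec{\pi }}{}^\top = \underline{\varvec{\Delta }}{}^\top _{\varvec{\eta }}. \end{aligned}$$

Premultiplying by $\underline{\varvec{\mathcal {I}}}_{\varvec{\eta }}^{-1}$ gives the right-hand side of Eq. 28.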

By the Delta method,

$$\begin{aligned} \mathbf {e} = \begin{pmatrix} \mathbf {I}_{C} &{}{} -{\underline{\varvec{\Delta }}}_{\varvec{\xi }} &{}{} -{\underline{\varvec{\Delta }}}_{\varvec{\eta }}\\ \end{pmatrix} \begin{pmatrix} \mathbf {p} - \underline{\varvec{\pi }}\\ \hat{\varvec{\xi }} - \underline{\varvec{\xi }}\\ \hat{\varvec{\eta }} - \underline{\varvec{\eta }} \end{pmatrix} + o_p(1), \end{aligned}$$

(29)

in which $\mathbf{I}_{C}$ denotes a $C\times C$ identity matrix. Combining Eqs. 27 and 29 yields

$$\begin{aligned} \varvec{\Sigma }({\varvec{\xi }},{\varvec{\eta }},c) = \begin{pmatrix} \mathbf {I}_{C} & -\varvec{\Delta }_{\varvec{\xi }} & -\varvec{\Delta }_{\varvec{\eta }} \end{pmatrix} \begin{pmatrix} \varvec{\Gamma } & \varvec{\Theta }_{21}^\top & \varvec{\Theta }_{31}^\top \\[1mm] \varvec{\Theta }_{21} & c{\varvec{\Omega }}_{\varvec{\xi }} & {\varvec{\Omega }}{}^\top _{\varvec{\eta \xi }}\\[1mm] \varvec{\Theta }_{31} & {\varvec{\Omega }}_{\varvec{\eta \xi }} & {\varvec{\Omega }}_{\varvec{\eta }} \end{pmatrix} \begin{pmatrix} \mathbf {I}_{C} \\[1mm] -\varvec{\Delta }{}^\top _{\varvec{\xi }} \\[1mm] -\varvec{\Delta }{}^\top _{\varvec{\eta }} \end{pmatrix}, \end{aligned}$$

(30)

which simplifies to Eq. 14.
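
The quadratic form in Eq. 30 is straightforward to evaluate numerically once its blocks are available. The sketch below assembles the three factors exactly as written in Eq. 30 and multiplies them out; it does not attempt the simplification to Eq. 14, which is not reproduced in this excerpt. The function name, the argument names, and the convention that $\varvec{\Delta }_{\varvec{\xi }}$ is $C\times r$ and $\varvec{\Delta }_{\varvec{\eta }}$ is $C\times q$ are assumptions made for illustration.

```python
import numpy as np


def residual_covariance(gamma, delta_xi, delta_eta, theta_31,
                        omega_xi, omega_eta_xi, omega_eta, c, theta_21=None):
    """Evaluate the matrix product on the right-hand side of Eq. 30.

    Assumed shapes (hypothetical convention): gamma (C, C), delta_xi (C, r),
    delta_eta (C, q), theta_31 (q, C), omega_xi (r, r), omega_eta_xi (q, r),
    omega_eta (q, q). theta_21 (r, C) defaults to zero, as stated after Eq. 27.
    """
    gamma = np.asarray(gamma, dtype=float)
    delta_xi = np.asarray(delta_xi, dtype=float)
    delta_eta = np.asarray(delta_eta, dtype=float)
    theta_31 = np.asarray(theta_31, dtype=float)
    omega_xi = np.asarray(omega_xi, dtype=float)
    omega_eta_xi = np.asarray(omega_eta_xi, dtype=float)
    omega_eta = np.asarray(omega_eta, dtype=float)

    n_cells = gamma.shape[0]
    n_xi = delta_xi.shape[1]
    if theta_21 is None:
        theta_21 = np.zeros((n_xi, n_cells))
    theta_21 = np.asarray(theta_21, dtype=float)

    # Left factor of Eq. 30: (I_C, -Delta_xi, -Delta_eta).
    jac = np.hstack([np.eye(n_cells), -delta_xi, -delta_eta])
    # Middle factor: joint limiting covariance from Eq. 27.
    joint = np.block([
        [gamma, theta_21.T, theta_31.T],
        [theta_21, c * omega_xi, omega_eta_xi.T],
        [theta_31, omega_eta_xi, omega_eta],
    ])
    return jac @ joint @ jac.T
```

Building the full joint matrix is wasteful when $C$ is large, since it is the number of response patterns; it is used here only because it mirrors Eq. 30 term by term and makes the block structure easy to check.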

About this article

Cite this article

Liu, Y., Yang, J.S. & Maydeu-Olivares, A. Restricted Recalibration of Item Response Theory Models. Psychometrika 84, 529–553 (2019). https://doi.org/10.1007/s11336-019-09667-4

Received: 16 October 2017
Published: 20 March 2019
Issue Date: 15 June 2019
DOI: https://doi.org/10.1007/s11336-019-09667-4
