Restricted Recalibration of Item Response Theory Models

Psychometrika

Abstract

In item response theory (IRT), it is often necessary to perform restricted recalibration (RR) of the model: A set of (focal) parameters is estimated holding a set of (nuisance) parameters fixed. Typical applications of RR include expanding an existing item bank, linking multiple test forms, and associating constructs measured by separately calibrated tests. In the current work, we provide full statistical theory for RR of IRT models under the framework of pseudo-maximum likelihood estimation. We describe the standard error calculation for the focal parameters, the assessment of overall goodness-of-fit (GOF) of the model, and the identification of misfitting items. We report a simulation study to evaluate the performance of these methods in the scenario of adding a new item to an existing test. Parameter recovery for the focal parameters as well as Type I error and power of the proposed tests are examined. An empirical example is also included, in which we validate the pediatric fatigue short-form scale in the Patient-Reported Outcome Measurement Information System (PROMIS), compute global and local GOF statistics, and update parameters for the misfitting items.
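
To make the "adding a new item to an existing test" scenario concrete, the following is a minimal Python sketch of restricted recalibration, not the authors' implementation. It assumes a two-parameter logistic model in slope-intercept form, a standard normal latent variable approximated by rectangular quadrature, and plain gradient ascent with step halving; all function names, the five old-item parameter values, and the simulated data are hypothetical. Only the new item's slope and intercept (the focal parameters) are updated; the old items' parameters (the nuisance parameters) stay fixed at their previously obtained values. Standard errors that account for the uncertainty in the fixed values are not computed here.

```python
import math
import random

def logistic(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

# Rectangular quadrature for a standard normal latent variable (an assumption).
NODES = [-4.0 + 0.4 * q for q in range(21)]
_dens = [math.exp(-0.5 * t * t) for t in NODES]
WEIGHTS = [v / sum(_dens) for v in _dens]

def fixed_part(xi_hat, old_resp):
    """Quadrature weight times the old items' likelihood at each node.
    Depends only on the fixed nuisance values xi_hat = [(slope, intercept), ...]."""
    out = []
    for t, w in zip(NODES, WEIGHTS):
        for (a, d), y in zip(xi_hat, old_resp):
            p = logistic(a * t + d)
            w *= p if y else 1.0 - p
        out.append(w)
    return out

def loglik_and_grad(eta, parts, new_resp):
    """Average pseudo-log-likelihood of eta = (slope, intercept) and its gradient."""
    a, d = eta
    p_new = [logistic(a * t + d) for t in NODES]
    ll = ga = gd = 0.0
    for part, y in zip(parts, new_resp):
        joint = [f * (p if y else 1.0 - p) for f, p in zip(part, p_new)]
        marg = sum(joint)
        ll += math.log(marg)
        for j, p, t in zip(joint, p_new, NODES):
            r = j / marg * (y - p)  # posterior weight times logistic residual
            ga += r * t
            gd += r
    n = len(new_resp)
    return ll / n, (ga / n, gd / n)

def restricted_recalibration(xi_hat, old, new_resp, start=(1.0, 0.0), iters=100):
    """Maximize the pseudo-log-likelihood over eta only; xi_hat is never updated."""
    parts = [fixed_part(xi_hat, row) for row in old]
    eta = start
    ll, grad = loglik_and_grad(eta, parts, new_resp)
    step = 1.0
    for _ in range(iters):
        cand = (eta[0] + step * grad[0], eta[1] + step * grad[1])
        ll_c, grad_c = loglik_and_grad(cand, parts, new_resp)
        if ll_c > ll:          # accept only steps that increase the objective
            eta, ll, grad = cand, ll_c, grad_c
        else:                  # otherwise shrink the step and retry
            step *= 0.5
            if step < 1e-10:
                break
    return eta, ll

# Hypothetical example: five old items with fixed parameters, one new item.
random.seed(1)
xi_hat = [(1.2, 0.5), (0.8, -0.3), (1.5, 0.0), (1.0, 1.0), (2.0, -1.0)]
true_eta = (1.3, -0.4)  # generating values for the new item (illustrative only)
old, new_resp = [], []
for _ in range(500):
    theta = random.gauss(0.0, 1.0)
    old.append([int(random.random() < logistic(a * theta + d)) for a, d in xi_hat])
    new_resp.append(int(random.random() < logistic(true_eta[0] * theta + true_eta[1])))

eta_hat, ll_hat = restricted_recalibration(xi_hat, old, new_resp)
print("estimated (slope, intercept) of the new item:", eta_hat)
```

Because the old items' contribution at each quadrature node does not depend on the focal parameters, it is computed once per respondent and reused in every iteration; this is the practical payoff of holding the nuisance parameters fixed.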

Notes

  1. We assume that the response patterns have been sorted in an arbitrary but fixed order.

  2. When the items are ordinal, the statistic defined by Eq. 20 differs from the original \(M_2\) statistic proposed by Maydeu-Olivares and Joe (2006). It is the same as the \(M_{ord}\) statistic in Maydeu-Olivares and Joe (2014) and the \(M_2^*\) statistic in Cai and Hansen (2013).

  3. Because the nuisance parameters \(\varvec{\xi }\) were estimated by ML from the previous data \(\mathbf{Y}'\), \({\varvec{\Omega }}_{\varvec{\xi }}\) amounts to the inverse Fisher information matrix with respect to the intercept and slope parameters of the first 9 items.

References

  • Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B (Methodological), 57, 289–300.

  • Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 395–479). Reading, MA: Addison-Wesley.

  • Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46(4), 443–459.

  • Bock, R. D., & Lieberman, M. (1970). Fitting a response model for \(n\) dichotomously scored items. Psychometrika, 35(2), 179–197.

  • Bock, R. D., & Zimowski, M. F. (1997). Multiple group IRT. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 433–448). New York: Springer.

  • Box, G. E. P. (1954). Some theorems on quadratic forms applied in the study of analysis of variance problems: I. Effect of inequality of variance in the one-way classification. The Annals of Mathematical Statistics, 25(2), 290–302.

  • Boyd, S., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press.

  • Bradlow, E. T., Wainer, H., & Wang, X. (1999). A Bayesian random effects model for testlets. Psychometrika, 64(2), 153–168.

  • Breithaupt, K., Ariel, A. A., & Hare, D. R. (2010). Assembling an inventory of multistage adaptive testing systems. In W. J. van der Linden & C. A. Glas (Eds.), Elements of adaptive testing (pp. 247–266). New York, NY: Springer.

  • Browne, M. W. (2000). Cross-validation methods. Journal of Mathematical Psychology, 44(1), 108–132.

  • Cai, L., & Hansen, M. (2013). Limited-information goodness-of-fit testing of hierarchical item factor models. British Journal of Mathematical and Statistical Psychology, 66(2), 245–276.

  • Cai, L., Maydeu-Olivares, A., Coffman, D. L., & Thissen, D. (2006). Limited-information goodness-of-fit testing of item response theory models for sparse \(2^{p}\) tables. British Journal of Mathematical and Statistical Psychology, 59(1), 173–194.

  • Cheng, Y., & Yuan, K.-H. (2010). The impact of fallible item parameter estimates on latent trait recovery. Psychometrika, 75, 280–291.

  • Cochran, W. G. (1952). The \({\chi }^{2}\) test of goodness of fit. The Annals of Mathematical Statistics, 23(3), 315–345.

  • Cressie, N., & Read, T. R. (1984). Multinomial goodness-of-fit tests. Journal of the Royal Statistical Society, Series B (Methodological), 46(3), 440–464.

  • Curran, P. J., & Hussong, A. M. (2009). Integrative data analysis: The simultaneous analysis of multiple data sets. Psychological Methods, 14(2), 81–100.

  • Drasgow, F., Levine, M. V., Tsien, S., Williams, B., & Mead, A. D. (1995). Fitting polytomous item response theory models to multiple-choice tests. Applied Psychological Measurement, 19(2), 143–166.

  • Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Erlbaum.

  • Fox, J.-P. (2005). Multilevel IRT using dichotomous and polytomous response data. British Journal of Mathematical and Statistical Psychology, 58(1), 145–172.

  • Glas, C. A. (1988). The derivation of some tests for the Rasch model from the multinomial distribution. Psychometrika, 53(4), 525–546.

  • Glas, C. A. (1999). Modification indices for the 2-PL and the nominal response model. Psychometrika, 64(3), 273–294.

  • Glas, C. A., & Suárez Falcón, J. C. (2003). A comparison of item-fit statistics for the three-parameter logistic model. Applied Psychological Measurement, 27(2), 87–106. https://doi.org/10.1177/0146621602250530.

  • Gong, G., & Samaniego, F. J. (1981). Pseudo maximum likelihood estimation: Theory and applications. The Annals of Statistics, 9(4), 861–869.

  • Gunsjö, A. (1994). Faktoranalys av ordinala variabler. Stockholm: Acta Universitatis Upsaliensis.

  • Haberman, S. J. (2006). Adaptive quadrature for item response models. ETS Research Report Series, 2006(2), 1–10.

  • Haberman, S. J., & Sinharay, S. (2013). Generalized residuals for general models for contingency tables with application to item response theory. Journal of the American Statistical Association, 108(504), 1435–1444.

  • Haberman, S. J., Sinharay, S., & Chon, K. H. (2013). Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions. Psychometrika, 78(3), 417–440.

  • Haley, S. M., Ni, P., Jette, A. M., Tao, W., Moed, R., Meyers, D., et al. (2009). Replenishing a computerized adaptive test of patient-reported daily activity functioning. Quality of Life Research, 18(4), 461–471.

  • Hofer, S. M., & Piccinin, A. M. (2009). Integrative data analysis through coordination of measurement and analysis protocol across independent longitudinal studies. Psychological Methods, 14(2), 150–164.

  • Joe, H., & Maydeu-Olivares, A. (2006). On the asymptotic distribution of Pearson’s \(X^{2}\) in cross-validation samples. Psychometrika, 71(3), 587–592.

  • Joe, H., & Maydeu-Olivares, A. (2010). A general family of limited information goodness-of-fit statistics for multinomial data. Psychometrika, 75(3), 393–419.

  • Jöreskog, K. G., & Moustaki, I. (2001). Factor analysis of ordinal variables: A comparison of three approaches. Multivariate Behavioral Research, 36(3), 347–387.

  • Kim, S. (2006). A comparative study of IRT fixed parameter calibration methods. Journal of Educational Measurement, 43(4), 355–381.

  • Lai, J.-S., Stucky, B. D., Thissen, D., Varni, J. W., DeWitt, E. M., Irwin, D. E., et al. (2013). Development and psychometric properties of the PROMIS® pediatric fatigue item banks. Quality of Life Research, 22(9), 2417–2427. https://doi.org/10.1007/s11136-013-0357-1.

  • Liu, Y., & Maydeu-Olivares, A. (2014). Identifying the source of misfit in item response theory models. Multivariate Behavioral Research, 49(4), 354–371.

  • Liu, Y., & Thissen, D. (2012). Identifying local dependence with a score test statistic based on the bifactor logistic model. Applied Psychological Measurement, 36(8), 670–688.

  • Liu, Y., & Thissen, D. (2014). Comparing score tests and other local dependence diagnostics for the graded response model. British Journal of Mathematical and Statistical Psychology, 67(3), 496–513.

  • Liu, Y., & Yang, J. S. (2017). Interval estimation of latent variable scores in item response theory. Journal of Educational and Behavioral Statistics. https://doi.org/10.3102/1076998617732764.

  • Liu, Y., & Yang, J. S. (2018). Bootstrap-calibrated interval estimates for latent variable scores in item response theory. Psychometrika, 83(2), 333–354.

  • Luecht, R. M. (2006). Operational issues in computer-based testing. In D. Bartram & R. Hambleton (Eds.), Computer-based testing and the internet: Issues and advances (pp. 91–114). New York: Wiley.

  • Magnus, J., & Neudecker, H. (1999). Matrix differential calculus with applications in statistics and econometrics. New York: Wiley.

  • Maydeu-Olivares, A., & Joe, H. (2005). Limited- and full-information estimation and goodness-of-fit testing in \(2^{n}\) contingency tables: A unified framework. Journal of the American Statistical Association, 100(471), 1009–1020.

  • Maydeu-Olivares, A., & Joe, H. (2006). Limited information goodness-of-fit testing in multidimensional contingency tables. Psychometrika, 71(4), 713–732.

  • Maydeu-Olivares, A., & Joe, H. (2008). An overview of limited information goodness-of-fit testing in multidimensional contingency tables. In K. Shigemasu, A. Okada, T. Imaizumi, & T. Hoshino (Eds.), New trends in psychometrics (pp. 253–262). Tokyo: Universal Academy Press.

  • Maydeu-Olivares, A., & Joe, H. (2014). Assessing approximate fit in categorical data analysis. Multivariate Behavioral Research, 49(4), 305–328.

  • Maydeu-Olivares, A., & Liu, Y. (2015). Item diagnostics in multivariate discrete data. Psychological Methods, 20(2), 276–292.

  • McDonald, R. P. (1981). The dimensionality of tests and items. British Journal of Mathematical and Statistical Psychology, 34(1), 100–117.

  • Meng, X.-L., & Wong, W. H. (1996). Simulating ratios of normalizing constants via a simple identity: A theoretical exploration. Statistica Sinica, 6(4), 831–860.

  • Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58(4), 525–543.

  • Mosier, C. I. (1951). Symposium: The need and means of cross-validation. I. Problems and designs of cross-validation. Educational and Psychological Measurement, 11(1), 5–11.

  • Muthén, B. (1978). Contributions to factor analysis of dichotomous variables. Psychometrika, 43(4), 551–560.

  • Muthén, B. (1983). Latent variable structural equation modeling with categorical data. Journal of Econometrics, 22(1–2), 43–65.

  • Muthén, B. (1984). A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika, 49(1), 115–132.

  • Muthén, B. (1993). Goodness of fit with categorical and other nonnormal variables. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 205–234). Newbury Park, CA: Sage.

  • Muthén, L. K., & Muthén, B. O. (1998–2017). Mplus user’s guide [Computer software manual]. Los Angeles, CA.

  • Parke, W. R. (1986). Pseudo maximum likelihood estimation: The asymptotic distribution. The Annals of Statistics, 14(1), 355–357.

  • R Core Team. (2018). R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria. Retrieved from https://www.R-project.org/

  • Ranger, J., & Kuhn, J.-T. (2012). Assessing fit of item response models using the information matrix test. Journal of Educational Measurement, 49(3), 247–268.

  • Rao, C. R. (1973). Linear statistical inference and its applications. New York: Wiley.

  • Read, T. R. (1984). Closer asymptotic approximations for the distributions of the power divergence goodness-of-fit statistics. Annals of the Institute of Statistical Mathematics, 36(1), 59–69.

  • Reiser, M. (1996). Analysis of residuals for the multinomial item response model. Psychometrika, 61(3), 509–528.

  • Rubin, D. B. (1984). Bayesianly justifiable and relevant frequency calculations for the applied statistician. The Annals of Statistics, 12(4), 1151–1172.

  • Rupp, A. A. (2013). A systematic review of the methodology for person fit research in item response theory: Lessons about generalizability of inferences from the design of simulation studies. Psychological Test and Assessment Modeling, 55(1), 3–38.

  • Rupp, A. A., & Zumbo, B. D. (2006). Understanding parameter invariance in unidimensional IRT models. Educational and Psychological Measurement, 66(1), 63–84.

  • Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika monograph No. 17. Richmond, VA: Psychometric Society.

  • Schilling, S., & Bock, R. D. (2005). High-dimensional maximum marginal likelihood item factor analysis by adaptive quadrature. Psychometrika, 70(3), 533–555.

  • Thissen, D., Liu, Y., Magnus, B., & Quinn, H. (2015). Extending the use of multidimensional IRT calibration as projection: Many-to-one linking and linear computation of projected scores. In Quantitative psychology research (pp. 1–16). Springer.

  • Thissen, D., & Steinberg, L. (2009). Item response theory. In R. Millsap & A. Maydeu-Olivares (Eds.), The Sage handbook of quantitative methods in psychology (pp. 148–177). London: Sage Publications.

  • Thissen, D., Steinberg, L., & Kuang, D. (2002). Quick and easy implementation of the Benjamini-Hochberg procedure for controlling the false positive rate in multiple comparisons. Journal of Educational and Behavioral Statistics, 27(1), 77–83.

  • Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 67–113). Hillsdale, NJ: Lawrence Erlbaum Associates.

  • Thissen, D., Varni, J. W., Stucky, B. D., Liu, Y., Irwin, D. E., & DeWalt, D. A. (2011). Using the PedsQL™ 3.0 asthma module to obtain scores comparable with those of the PROMIS pediatric asthma impact scale (PAIS). Quality of Life Research, 20(9), 1497–1505.

  • van der Vaart, A. W. (2000). Asymptotic statistics. New York: Cambridge University Press.

  • Venables, W. N., & Ripley, B. D. (2002). Modern applied statistics with S (4th ed.). New York: Springer. (ISBN 0-387-95457-0).

  • von Davier, M., & von Davier, A. A. (2007). A unified approach to IRT scale linking and scale transformations. Methodology, 3(3), 115–124.

  • Wollack, J. A., Cohen, A. S., & Wells, C. S. (2003). A method for maintaining scale stability in the presence of test speededness. Journal of Educational Measurement, 40(4), 307–330.

  • Yang, J. S., Hansen, M., & Cai, L. (2012). Characterizing sources of uncertainty in item response theory scale scores. Educational and Psychological Measurement, 72(2), 264–290.

  • Zhao, Y., & Joe, H. (2005). Composite likelihood estimation in multivariate data analysis. Canadian Journal of Statistics, 33(3), 335–356.

Author information

Corresponding author

Correspondence to Yang Liu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors would like to thank Dr. David Thissen from the Department of Psychology at the University of North Carolina at Chapel Hill for his feedback and suggestions about this work. The participation of Ji Seung Yang was supported by the National Science Foundation under Grant EHR-1534846. The participation of Alberto Maydeu-Olivares was supported by the National Science Foundation under Grant SES-1659936.

Appendix A The Asymptotic Distribution of Residuals

A.1 The Pseudo-Maximum Likelihood Estimator

Under the setup of RR and Conditions A1–A6 of Gong and Samaniego (1981), the following expansion of the n-sample likelihood equation (Eq. 4) is defined for \((\hat{\varvec{\xi }}, \hat{\varvec{\eta }}){}^\top \) in some neighborhood of \((\underline{\varvec{\xi }}, \underline{\varvec{\eta }}){}^\top \):

$$\begin{aligned} \mathbf{0}= & {} \frac{1}{n}\sum _{i=1}^{n}\hat{\mathbf{g}}(\mathbf{y}_i) = \frac{1}{n}\sum _{i=1}^{n}\underline{\mathbf{g}}(\mathbf{y}_i) + \left[ \frac{1}{n}\sum _{i=1}^{n}\underline{\mathbf{H}}_{\varvec{\eta \xi }}(\mathbf{y}_i)\right] (\hat{\varvec{\xi }} - \underline{\varvec{\xi }}) \nonumber \\&+\left[ \frac{1}{n}\sum _{i=1}^{n}\underline{\mathbf{H}}_{ \varvec{\eta }}(\mathbf{y}_i)\right] (\hat{\varvec{\eta }} - \underline{\varvec{\eta }}) + \mathbf{R}_n, \end{aligned}$$
(24)

in which \(\mathbf{g}(\mathbf{y}_i; {\varvec{\xi }}, {\varvec{\eta }}) = \partial \log \pi (\mathbf{y}_{i};{\varvec{\xi }}, {\varvec{\eta }})/\partial {\varvec{\eta }}\), \(\mathbf{H}_{\varvec{\eta }}(\mathbf{y}_i; {\varvec{\xi }}, {\varvec{\eta }}) = \partial ^2\log \pi (\mathbf{y}_{i};{\varvec{\xi }}, {\varvec{\eta }})/\partial {\varvec{\eta }}\partial {\varvec{\eta }}{}^\top \), \(\mathbf{H}_{\varvec{\eta \xi }}(\mathbf{y}_i; {\varvec{\xi }}, {\varvec{\eta }}) = \partial ^2\log \pi (\mathbf{y}_{i};{\varvec{\xi }}, {\varvec{\eta }})/\partial {\varvec{\eta }}\partial {\varvec{\xi }}{}^\top \), and \(\mathbf{R}_n\) denotes the remainder term. As usual, the hat symbol and underline indicate evaluations at the pseudo-ML estimates and the true values of parameters, respectively. If \(\frac{1}{n}\sum _{i=1}^{n}\underline{\mathbf{H}}_{ \varvec{\eta }}(\mathbf{y}_i)\) is invertible, then Eq. 24 can be rewritten as

$$\begin{aligned} \sqrt{n}(\hat{\varvec{\eta }} - {\underline{\varvec{\eta }}})= & {} \left[ -\frac{1}{n}\sum _{i=1}^{n} {\underline{\mathbf {H}}}_{\varvec{\eta }}({\mathbf {y}}_i)\right] ^{-1} \left\{ \frac{1}{\sqrt{n}}\sum _{i=1}^{n}{\underline{\mathbf {g}}} (\mathbf {y}_i) \right. \nonumber \\&\quad \left. + \sqrt{\frac{n}{n'}}\left[ \frac{1}{n}\sum _{i=1}^{n}\underline{\mathbf {H}}_{ \varvec{\eta \xi }}(\mathbf {y}_i)\right] \sqrt{n'}(\hat{\varvec{\xi }} - {\underline{\varvec{\xi }}}) + \sqrt{n}{\mathbf {R}}_n\right\} . \end{aligned}$$
(25)

The assumed regularity conditions guarantee that

$$\begin{aligned}&-\frac{1}{n}\sum _{i=1}^{n}{\underline{\mathbf {H}}}_{\varvec{\eta }} (\mathbf {y}_i){}{\mathop {\rightarrow }\limits ^{p}} \underline{\varvec{\mathcal {I}}}_{\varvec{\eta }},\ -\frac{1}{n}\sum _{i=1}^{n}\underline{\mathbf {H}}_{\varvec{\eta \xi }} (\mathbf {y}_i){}{\mathop {\rightarrow }\limits ^{p}} \underline{\varvec{\mathcal {I}}}_{\varvec{\eta \xi }},\ \frac{n}{n'}\rightarrow {\underline{c}},\ {\sqrt{n}} \mathbf {R}_n{}{\mathop {\rightarrow }\limits ^{p}}\mathbf {0},\nonumber \\&\frac{1}{\sqrt{n}}\sum _{i=1}^{n}\underline{\mathbf {g}}(\mathbf {y}_i){} {\mathop {\rightarrow }\limits ^{d}}\mathcal {N}(\mathbf {0}, \underline{\varvec{\mathcal {I}}}_{\varvec{\eta }}),\ \text{ and } \sqrt{n'}(\hat{\varvec{\xi }} - \underline{\varvec{\xi }}){} {\mathop {\rightarrow }\limits ^{d}}\mathcal {N}(\mathbf {0}, \underline{\varvec{\Omega }}_{\varvec{\xi }}). \end{aligned}$$
(26)

Equation 6 is established by combining Eqs. 25 and 26 and applying Slutsky’s lemma.
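
The combination can be sketched using only the quantities in Eqs. 25 and 26. The auxiliary vectors \(\mathbf{Z}_1\sim \mathcal{N}(\mathbf{0}, \underline{\varvec{\mathcal {I}}}_{\varvec{\eta }})\) and \(\mathbf{Z}_2\sim \mathcal{N}(\mathbf{0}, \underline{\varvec{\Omega }}_{\varvec{\xi }})\) are notation introduced here for illustration, and the sketch assumes that they are independent because the score is computed from the current sample whereas \(\hat{\varvec{\xi }}\) comes from the previous one. Substituting the limits in Eq. 26 into Eq. 25 then gives

$$\begin{aligned} \sqrt{n}(\hat{\varvec{\eta }} - \underline{\varvec{\eta }})&{\mathop {\rightarrow }\limits ^{d}} \underline{\varvec{\mathcal {I}}}_{\varvec{\eta }}^{-1}\left( \mathbf{Z}_1 - \sqrt{\underline{c}}\,\underline{\varvec{\mathcal {I}}}_{\varvec{\eta \xi }}\mathbf{Z}_2\right) , \\ \text{Cov}\left[ \underline{\varvec{\mathcal {I}}}_{\varvec{\eta }}^{-1}\left( \mathbf{Z}_1 - \sqrt{\underline{c}}\,\underline{\varvec{\mathcal {I}}}_{\varvec{\eta \xi }}\mathbf{Z}_2\right) \right]&= \underline{\varvec{\mathcal {I}}}_{\varvec{\eta }}^{-1} + \underline{c}\,\underline{\varvec{\mathcal {I}}}_{\varvec{\eta }}^{-1}\underline{\varvec{\mathcal {I}}}_{\varvec{\eta \xi }}\underline{\varvec{\Omega }}_{\varvec{\xi }}\underline{\varvec{\mathcal {I}}}{}^\top _{\varvec{\eta \xi }}\underline{\varvec{\mathcal {I}}}_{\varvec{\eta }}^{-1}. \end{aligned}$$

The minus sign appears because the averaged cross-derivative in Eq. 25 converges to \(-\underline{\varvec{\mathcal {I}}}_{\varvec{\eta \xi }}\). The first covariance term is the usual ML covariance when \(\underline{\varvec{\xi }}\) is known; the second is the inflation caused by carrying over the sampling error in \(\hat{\varvec{\xi }}\), and it vanishes as \(\underline{c}\rightarrow 0\), that is, when the previous sample is much larger than the current one.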

A.2 The Asymptotic Covariance Matrix of Residuals

It is straightforward to show that \(\mathbf{p} - \underline{\varvec{\pi }}\), \(\hat{\varvec{\xi }} - \underline{\varvec{\xi }}\), and \(\hat{\varvec{\eta }} - \underline{\varvec{\eta }}\) are jointly asymptotically normal when the model is correctly specified:

$$\begin{aligned} \sqrt{n}\begin{pmatrix} \mathbf{p} - \underline{\varvec{\pi }}\\ \hat{\varvec{\xi }} - \underline{\varvec{\xi }}\\ \hat{\varvec{\eta }} - \underline{\varvec{\eta }} \end{pmatrix}{}{\mathop {\rightarrow }\limits ^{d}}\mathcal{N}\left( \mathbf{0}, \begin{pmatrix} \underline{\varvec{\Gamma }} &{} \underline{\varvec{\Theta }}{}^\top _{21} &{} \underline{\varvec{\Theta }}{}^\top _{31}\\ \underline{\varvec{\Theta }}_{21} &{} \underline{c}\underline{\varvec{\Omega }}_{\varvec{\xi }} &{} \underline{\varvec{\Omega }}{}^\top _{\varvec{\eta \xi }}\\ \underline{\varvec{\Theta }}_{31} &{} \underline{\varvec{\Omega }}_{\varvec{\eta \xi }} &{} \underline{\varvec{\Omega }}_{\varvec{\eta }} \end{pmatrix} \right) , \end{aligned}$$
(27)

in which \(\underline{\varvec{\Theta }}_{21} = \mathbf {0}\) because \(\hat{\varvec{\xi }}\), which is estimated from the previous data \(\mathbf {Y}'\), is independent of the current-sample proportions \(\mathbf {p}\), and

$$\begin{aligned} {\underline{\varvec{\Theta }}_{31}} = {\underline{\varvec{\mathcal {I}}}}_{\varvec{\eta }}^{-1}{\underline{\varvec{\Delta }}}{}^\top _{\varvec{\eta }}{\underline{\mathbf {D}}}^{-1}\text{ Cov }(\sqrt{n}{} \mathbf {p}) = {\underline{\varvec{\mathcal {I}}}}_{\varvec{\eta }}^{-1}{\underline{\varvec{\Delta }}}{}^\top _{\varvec{\eta }}. \end{aligned}$$
(28)
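
The second equality in Eq. 28 can be verified in one line. The sketch below rests on assumptions not stated in this appendix: that \(\underline{\mathbf {D}} = \text{diag}(\underline{\varvec{\pi }})\), that \(\text{Cov}(\sqrt{n}\mathbf {p}) = \underline{\mathbf {D}} - \underline{\varvec{\pi }}\,\underline{\varvec{\pi }}{}^\top \) is the multinomial covariance matrix, and that \(\mathbf {1}_C\) denotes a \(C\times 1\) vector of ones (notation introduced here). Because the \(C\) response pattern probabilities sum to one for every value of \(\varvec{\eta }\), their derivatives sum to zero, so \(\underline{\varvec{\Delta }}{}^\top _{\varvec{\eta }}\mathbf {1}_C = \mathbf {0}\); also \(\underline{\mathbf {D}}^{-1}\underline{\varvec{\pi }} = \mathbf {1}_C\). Hence

$$\begin{aligned} \underline{\varvec{\Delta }}{}^\top _{\varvec{\eta }}\underline{\mathbf {D}}^{-1}\left( \underline{\mathbf {D}} - \underline{\varvec{\pi }}\,\underline{\varvec{\pi }}{}^\top \right) = \underline{\varvec{\Delta }}{}^\top _{\varvec{\eta }} - \left( \underline{\varvec{\Delta }}{}^\top _{\varvec{\eta }}\mathbf {1}_C\right) \underline{\varvec{\pi }}{}^\top = \underline{\varvec{\Delta }}{}^\top _{\varvec{\eta }}. \end{aligned}$$

Premultiplying by \(\underline{\varvec{\mathcal {I}}}_{\varvec{\eta }}^{-1}\) recovers the right-hand side of Eq. 28.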

By the Delta method,

$$\begin{aligned} \mathbf {e} = \begin{pmatrix} \mathbf {I}_{C} &{}{} -{\underline{\varvec{\Delta }}}_{\varvec{\xi }} &{}{} -{\underline{\varvec{\Delta }}}_{\varvec{\eta }}\\ \end{pmatrix} \begin{pmatrix} \mathbf {p} - \underline{\varvec{\pi }}\\ \hat{\varvec{\xi }} - \underline{\varvec{\xi }}\\ \hat{\varvec{\eta }} - \underline{\varvec{\eta }} \end{pmatrix} + o_p(1), \end{aligned}$$
(29)

in which \(\mathbf{I}_{C}\) denotes a \(C\times C\) identity matrix. Combining Eqs. 27 and 29 yields

$$\begin{aligned} \varvec{\Sigma }({\varvec{\xi }},{\varvec{\eta }},c) = \begin{pmatrix} \mathbf {I}_{C} &{}{} -\varvec{\Delta }_{\varvec{\xi }} &{}{} -\varvec{\Delta }_{\varvec{\eta }}\\ \end{pmatrix} \begin{pmatrix} \varvec{\Gamma } &{}{} \varvec{\Theta }_{21}^\top &{}{} \varvec{\Theta }_{31}^\top \\[1mm] \varvec{\Theta }_{21} &{}{} c{\varvec{\Omega }}_{\varvec{\xi }} &{}{} {\varvec{\Omega }}{}^\top _{\varvec{\eta \xi }}\\[1mm] \varvec{\Theta }_{31} &{}{} {\varvec{\Omega }}_{\varvec{\eta \xi }} &{}{} {\varvec{\Omega }}_{\varvec{\eta }} \end{pmatrix} \begin{pmatrix} \mathbf {I}_{C} \\[1mm] -\varvec{\Delta }{}^\top _{\varvec{\xi }} \\[1mm] -\varvec{\Delta }{}^\top _{\varvec{\eta }}\\ \end{pmatrix}, \end{aligned}$$
(30)

which simplifies to Eq. 14.
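
The simplification can be traced by multiplying out Eq. 30 block by block. The expansion below is a sketch that uses only \(\varvec{\Theta }_{21} = \mathbf {0}\), Eq. 28, and the symmetry of \(\varvec{\mathcal {I}}_{\varvec{\eta }}\); because Eq. 14 is not reproduced in this appendix, its final arrangement of terms may differ from the one shown here.

$$\begin{aligned} \varvec{\Sigma }({\varvec{\xi }},{\varvec{\eta }},c)&= \varvec{\Gamma } - \varvec{\Theta }_{21}^\top \varvec{\Delta }{}^\top _{\varvec{\xi }} - \varvec{\Delta }_{\varvec{\xi }}\varvec{\Theta }_{21} - \varvec{\Theta }_{31}^\top \varvec{\Delta }{}^\top _{\varvec{\eta }} - \varvec{\Delta }_{\varvec{\eta }}\varvec{\Theta }_{31} \nonumber \\&\quad + c\varvec{\Delta }_{\varvec{\xi }}\varvec{\Omega }_{\varvec{\xi }}\varvec{\Delta }{}^\top _{\varvec{\xi }} + \varvec{\Delta }_{\varvec{\xi }}\varvec{\Omega }{}^\top _{\varvec{\eta \xi }}\varvec{\Delta }{}^\top _{\varvec{\eta }} + \varvec{\Delta }_{\varvec{\eta }}\varvec{\Omega }_{\varvec{\eta \xi }}\varvec{\Delta }{}^\top _{\varvec{\xi }} + \varvec{\Delta }_{\varvec{\eta }}\varvec{\Omega }_{\varvec{\eta }}\varvec{\Delta }{}^\top _{\varvec{\eta }} \nonumber \\&= \varvec{\Gamma } - 2\varvec{\Delta }_{\varvec{\eta }}\varvec{\mathcal {I}}_{\varvec{\eta }}^{-1}\varvec{\Delta }{}^\top _{\varvec{\eta }} + c\varvec{\Delta }_{\varvec{\xi }}\varvec{\Omega }_{\varvec{\xi }}\varvec{\Delta }{}^\top _{\varvec{\xi }} + \varvec{\Delta }_{\varvec{\xi }}\varvec{\Omega }{}^\top _{\varvec{\eta \xi }}\varvec{\Delta }{}^\top _{\varvec{\eta }} + \varvec{\Delta }_{\varvec{\eta }}\varvec{\Omega }_{\varvec{\eta \xi }}\varvec{\Delta }{}^\top _{\varvec{\xi }} + \varvec{\Delta }_{\varvec{\eta }}\varvec{\Omega }_{\varvec{\eta }}\varvec{\Delta }{}^\top _{\varvec{\eta }}. \end{aligned}$$

The two cross terms involving \(\varvec{\Theta }_{21}\) drop out, and each cross term involving \(\varvec{\Theta }_{31}\) contributes \(-\varvec{\Delta }_{\varvec{\eta }}\varvec{\mathcal {I}}_{\varvec{\eta }}^{-1}\varvec{\Delta }{}^\top _{\varvec{\eta }}\).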

Cite this article

Liu, Y., Yang, J.S. & Maydeu-Olivares, A. Restricted Recalibration of Item Response Theory Models. Psychometrika 84, 529–553 (2019). https://doi.org/10.1007/s11336-019-09667-4

  • DOI: https://doi.org/10.1007/s11336-019-09667-4
