Abstract
The nonequivalent groups with anchor test (NEAT) design, also known as the common-item nonequivalent groups design (Kolen & Brennan, 2004), is used to equate scores on several large-scale tests, such as the SAT® and the certification examinations conducted by the American Society for Quality. The two popular observed-score equating (OSE) methods for the NEAT design are chain equating (CE) and poststratification equating (PSE). Here, we consider their nonlinear versions: frequency estimation equipercentile equating (FEEE) for PSE and chained equipercentile equating (CEE) for CE (see Kolen & Brennan, 2004, for further details on these methods).
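For orientation, the two nonlinear methods named above can be written compactly as follows. This is a sketch in notation assumed here rather than taken from the abstract: the new test X is taken by population P, the old test Y by population Q, and the anchor A by both; F, G, and H denote the (continuized) cumulative distribution functions of X, Y, and A, with a subscript marking the population; f, g, and h denote the corresponding score probabilities; and w is a weight in [0, 1] that defines the synthetic target population T = wP + (1 - w)Q.

```latex
% Chained equipercentile equating (CEE): link X to A on P, then A to Y on Q.
e_Y^{\mathrm{CEE}}(x) = G_Q^{-1}\Bigl( H_Q\bigl( H_P^{-1}( F_P(x) ) \bigr) \Bigr)

% Frequency estimation (FEEE): assume the conditional distributions of X given A
% and of Y given A are the same in P and Q, then estimate the marginals on T.
h_T(a) = w\,h_P(a) + (1 - w)\,h_Q(a)
f_T(x) = \sum_a f_P(x \mid a)\, h_T(a), \qquad
g_T(y) = \sum_a g_Q(y \mid a)\, h_T(a)

% Equipercentile equating of X to Y on the target population T,
% where F_T and G_T are the CDFs built from f_T and g_T.
e_Y^{\mathrm{FEEE}}(x) = G_T^{-1}\bigl( F_T(x) \bigr)
```

The two methods rest on different untestable assumptions about the data that the NEAT design leaves missing (X in Q and Y in P). CEE treats the two single-population links as population invariant, whereas FEEE treats the conditional score distributions given the anchor as population invariant.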
Notes
1. Note that X and Y are often different forms of the same test (for example, forms A and B of the SAT) rather than being different tests. We call them tests rather than test forms for simplicity. The new test is often referred to as the test/form to be equated, and the old test is referred to as the test/form to be equated to.
References
Braun, H. I., & Holland, P. W. (1982). Observed-score test equating: A mathematical analysis of some ETS equating procedures. In P. W. Holland & D. B. Rubin (Eds.), Test equating (pp. 9–49). New York, NY: Academic Press.
Harris, D. J., & Kolen, M. J. (1990). A comparison of two equipercentile equating methods for common item equating. Educational and Psychological Measurement, 50, 61–71.
Holland, P. W., Sinharay, S., von Davier, A. A., & Han, N. (2008). An approach to evaluating the missing data assumptions of the chain and post-stratification equating methods for the NEAT design. Journal of Educational Measurement, 45, 17–43.
Holland, P. W., & Thayer, D. T. (2000). Univariate and bivariate loglinear models for discrete test score distributions. Journal of Educational and Behavioral Statistics, 25, 133–183.
Klein, L. W., & Jarjoura, D. (1985). The importance of content representation for common-item equating with nonrandom groups. Journal of Educational Measurement, 22, 197–206.
Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking: Methods and practices (2nd ed.). New York, NY: Springer.
Livingston, S. A. (2004). Equating test scores (without IRT). Princeton, NJ: ETS.
Livingston, S. A., Dorans, N. J., & Wright, N. K. (1990). What combination of sampling and equating methods works best? Applied Measurement in Education, 3, 73–95.
Marco, G. L., Petersen, N. S., & Stewart, E. E. (1983). A test of the adequacy of curvilinear score equating models. In D. Weiss (Ed.), New horizons in testing: Latent trait test theory and computerized adaptive testing. New York, NY: Academic Press.
Puhan, G. (2010). A comparison of chained linear and post stratification linear equating under different testing conditions. Journal of Educational Measurement, 47, 54–75.
Ricker, K., & von Davier, A. A. (2007). The impact of anchor test length on equating results in a non-equivalent group design. Princeton, NJ: ETS (ETS Research Rep. No. RR-07-44).
Sinharay, S., & Holland, P. W. (2007). Is it necessary to make anchor tests mini-versions of the tests being equated or can some restrictions be relaxed? Journal of Educational Measurement, 44, 249–275.
Tong, Y., & Kolen, M. J. (2005). Assessing equating results on different equating criteria. Applied Psychological Measurement, 29, 418–432.
von Davier, A. A. (2003). Notes on linear equating methods for the non-equivalent groups design. Princeton, NJ: ETS (ETS Research Rep. No. RR-03-24).
von Davier, A. A., Holland, P. W., Livingston, S. A., Casabianca, J., Grant, M. C., & Martin, K. (2006). An evaluation of the kernel equating method: A special study with pseudo-tests constructed from real test data. Princeton, NJ: ETS (ETS Research Rep. No. RR-06-02).
von Davier, A. A., Holland, P. W., & Thayer, D. T. (2003). Population invariance and chain versus poststratification methods for equating and test linking. In N. Dorans (Ed.), Population invariance of score linking: Theory and applications to Advanced Placement Program® Examinations. Princeton, NJ: ETS (ETS Research Rep. No. RR-03-27).
von Davier, A. A., Holland, P. W., & Thayer, D. T. (2004a). The kernel method of test equating. New York, NY: Springer.
von Davier, A. A., Holland, P. W., & Thayer, D. T. (2004b). The chain and poststratification methods for observed-score equating: Their relationship to population invariance. Journal of Educational Measurement, 41, 15–32.
Wang, T., Lee, W., Brennan, R. L., & Kolen, M. J. (2008). A comparison of the frequency estimation and chained equipercentile methods under the common-item non-equivalent groups design. Applied Psychological Measurement, 32, 632–651.
Acknowledgments
This work was funded by Educational Testing Service. The author thanks Dan Eignor, Paul W. Holland, Rick Morgan, and Skip Livingston for helpful comments, and Ayleen Stelhorn and Kim Fryer for editorial help. Any opinions expressed here are those of the author and not necessarily of Educational Testing Service.
Copyright information
© 2011 Springer Science+Business Media, LLC
About this paper
Cite this paper
Sinharay, S. (2011). Chain Equipercentile Equating and Frequency Estimation Equipercentile Equating: Comparisons Based on Real and Simulated Data. In: Dorans, N., Sinharay, S. (eds) Looking Back. Lecture Notes in Statistics, vol 202. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9389-2_11
DOI: https://doi.org/10.1007/978-1-4419-9389-2_11
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-9388-5
Online ISBN: 978-1-4419-9389-2
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)