
An Ideal Measurement

A chapter in Educational Measurement for Applied Researchers

Abstract

When one undertakes the measurement of a latent trait, what are the desirable properties one would like the measures to have?


Further Reading

  • Bond TG, Fox CM (2007) Applying the Rasch model: fundamental measurement in the human sciences, 2nd edn. Lawrence Erlbaum Associates, Mahwah, NJ
  • Engelhard G (2013) Invariant measurement: using Rasch models in the social, behavioural, and health sciences. Routledge, New York, NY
  • Wilson M (2005) Constructing measures: an item response modeling approach. Lawrence Erlbaum Associates, Mahwah, NJ
  • Wright BD, Masters GN (1982) Rating scale analysis. Mesa Press, Chicago
  • Wright BD, Stone MH (1999) Measurement essentials. Wide Range, Inc., Wilmington, DE. http://www.rasch.org/measess/me-all.pdf


Author information

Correspondence to Margaret Wu.

Appendices

Hands-on Practices

Task 1

Use simulation to generate raw scores for students on an easy test and a hard test.

  1. Q1.

    Plot the two sets of test scores against each other on a graph.

  2. Q2.

    Apply a logistic transformation to the raw scores as follows:

  1. Step 1:

    Compute the percentage correct from the raw scores (raw score divided by the maximum possible score). Let p denote the percentage correct, expressed as a proportion between 0 and 1.

  2. Step 2:

    Compute the transformed score by applying the transformation log(p/(1 − p)), where log is the natural logarithm. The ratio p/(1 − p) is referred to as the “odds”, and the results of the transformation log(p/(1 − p)) are said to be in the “log of odds unit”, abbreviated as “logit”.

  3. Step 3:

    Plot the two sets of transformed scores against each other on a graph.

Discuss the shapes of the two graphs in terms of measurement invariance. Which graph is closer to a straight line?

Note: This hands-on practice demonstrates IRT viewed as a transformation of the raw scores. The actual mathematical modelling in IRT, however, is carried out at the level of individual items and individual persons, not at the level of test scores. In IRT software programs, logistic transformations of test scores or of item scores (the percentage of students answering an item correctly), as in this hands-on practice, are often used to provide initial values for the person and item parameters. A code sketch of this practice follows.
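
Below is a minimal Python sketch of this practice. The sample size, test lengths and item difficulty distributions are illustrative assumptions (the task does not prescribe them), and the responses are generated from a simple Rasch-type model.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

n_students, n_items = 500, 40                 # illustrative choices
abilities = rng.normal(0.0, 1.0, n_students)  # latent abilities (in logits)
easy_items = rng.normal(-1.0, 0.5, n_items)   # easy test: low item difficulties
hard_items = rng.normal(1.0, 0.5, n_items)    # hard test: high item difficulties

def raw_scores(theta, delta):
    """Simulate Rasch-type item responses and return each student's raw score."""
    p_correct = 1.0 / (1.0 + np.exp(-(theta[:, None] - delta[None, :])))
    responses = rng.random(p_correct.shape) < p_correct
    return responses.sum(axis=1)

def to_logit(raw, max_score):
    """Steps 1-2: percentage correct, then log-odds; clip to avoid log(0)."""
    p = np.clip(raw / max_score, 1e-3, 1 - 1e-3)
    return np.log(p / (1 - p))

easy_raw = raw_scores(abilities, easy_items)
hard_raw = raw_scores(abilities, hard_items)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.scatter(easy_raw, hard_raw, s=8)                                        # Q1: raw scores
ax1.set(xlabel="easy test (raw score)", ylabel="hard test (raw score)")
ax2.scatter(to_logit(easy_raw, n_items), to_logit(hard_raw, n_items), s=8)  # Q2: logits
ax2.set(xlabel="easy test (logit)", ylabel="hard test (logit)")
plt.tight_layout()
plt.show()
```

With choices like these, the raw-score plot typically bends because students bunch up near the maximum on the easy test, whereas the logit plot should lie closer to a straight line; that contrast is what the invariance question is asking you to discuss.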

Task 2

Investigate the relationship between raw scores and transformed logit scores. For example, if a test has a maximum score of 30, plot the raw scores (between 0 and 30) against their transformed logit scores. What do you observe about the distances between raw scores and the corresponding distances between logit scores? Is the relationship between raw scores and logit scores linear? If not, is there a range over which the relationship is approximately linear?
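
A short Python sketch of this task, assuming the 30-item example mentioned above (raw scores of 0 and 30 are excluded because their logits are infinite):

```python
import numpy as np
import matplotlib.pyplot as plt

max_score = 30
raw = np.arange(1, max_score)   # raw scores 1, 2, ..., 29
p = raw / max_score             # proportion correct
logit = np.log(p / (1 - p))     # log-odds transformation

plt.plot(raw, logit, marker="o")
plt.xlabel("raw score (out of 30)")
plt.ylabel("logit")
plt.show()
```

You should find that equal steps in raw score correspond to much larger steps in logits near the extremes than near the middle, so the relationship is only approximately linear for mid-range scores.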

Discussion Points

  1. (1)

    For what purposes of measurement would raw scores be sufficient? For what purposes of measurement should IRT be applied?

  2. (2)

    Based on the presentation in Chaps. 5 and 6, what do you think are the differences between classical test theory and item response theory?

  3. (3)

    The illustration of the principles for estimating ability (as shown from Figs. 6.8, 6.9, 6.10 and 6.11) relies on a response pattern that shows more items correct for easy items, and fewer items correct for difficult items. In this way one can identify the region where there are about equal numbers of correct and incorrect items. What happens if there is no clear pattern of item responses, such as a random scattering of incorrect items over the low to high scale, so that there is no clear region where the student’s ability might be?

Exercises

  1. Q1.

    As percentages, raw scores have a minimum of 0 and a maximum of 100. What are the minimum and maximum of the logits? (The logit is defined as in the Hands-on Practices section.)

  2. Q2.

    When percentage (p) is 50%, what is the value of the transformed logit?

  3. Q3.

    Consider two raw scores expressed in percentages, p1 and p2, where p2 is greater than p1. Let t1 and t2 denote the transformed logit scores of p1 and p2, respectively. Which of the following option(s) do you think are appropriate in relation to the relative magnitudes of t1 and t2?

      • t1 is greater than t2
      • t2 is greater than t1
      • One cannot say which is larger, as it depends on whether t1 and t2 are positive or negative
      • One cannot say which is larger, as it depends on whether p1 and p2 are below or above average

  4. Q4.

    The following shows the response pattern of a student. Can you estimate the student’s ability?


Copyright information

© 2016 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Wu, M., Tam, H.P., Jen, T.H. (2016). An Ideal Measurement. In: Educational Measurement for Applied Researchers. Springer, Singapore. https://doi.org/10.1007/978-981-10-3302-5_6

