Keywords: Predictive Validity, Criterion Validity, Face Validity, Concurrent Validity, Internal Reliability
Scale development is an essential stage in the assessment of constructs and variables in behavioral medicine and in the social and health sciences more broadly. Scales are used to assess self-reported variables such as mood, daily disability, various types of symptoms, and adherence to a recommended diet. Though there is no explicit “rule” for the stages of scale development, certain steps must be included before a scale can be claimed to be reliable and valid.

The reliability of a scale refers to its repeatability and lack of measurement error. It is tested by internal reliability tests (Cronbach’s α) and by the test-retest reliability of scores over time. The validity of a scale refers to the extent to which it measures what it claims to measure. It is tested in several ways, including “face validity,” concurrent validity, construct validity, and criterion validity.

When developing a scale, it is essential to have a clear definition of the concept it refers to. For example, an anxiety scale should not have items assessing depression, since these are not the same construct. After choosing an acceptable definition of the construct, a group of “experts” on the construct meets to propose items, or topics related to the construct from which the researcher creates items. The chosen items reflect the topics or items most commonly suggested by the expert panel. The panel can include experts from the field (psychologists, physicians, etc.) as well as patients who have experienced the issue under investigation and are thus “experts” by experience. The investigator can then ask another group of experts or patients to rate the relevance of each item to the construct, which reflects face validity. Items with a mean relevance rating above a chosen criterion are selected for the preliminary scale. Next, the researcher administers the preliminary scale to a larger sample, together with theoretically relevant additional tests.
This makes it possible to test the scale’s internal reliability, its concurrent validity against another scale assessing the same construct, and its construct validity against scales assessing theoretically related constructs. Finally, an acceptable criterion (e.g., an ill vs. a healthy sample) makes it possible to test the scale’s criterion validity.

Predictive validity can also be tested by examining whether the scale’s scores predict a certain future event or outcome, beyond the effects of known confounders. For example, Barefoot et al. (1989) tested the predictive validity of a shorter hostility scale derived from the original Ho scale in the Minnesota Multiphasic Personality Inventory. They found that the brief scale, which included cynicism, hostile affect, and aggressive responding, predicted death better than the original full Ho scale, supporting the brief scale’s predictive validity.

These steps are needed in scale development to verify a scale’s reliability and validity for use in research and clinical evaluations.
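As an illustration of two of the checks described above, the following minimal Python sketch computes Cronbach’s α for internal reliability and a Pearson correlation, which could serve for test-retest reliability or for concurrent validity against another scale. The function names (`cronbach_alpha`, `pearson_r`) and the small data set are hypothetical and invented for this example. Sample variances are assumed throughout. It is a sketch of the standard formulas, not a procedure prescribed by the entry.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items table of scores.

    responses: list of rows, one per respondent, each a list of item scores.
    Uses alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)).
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")
    items = list(zip(*responses))                 # one tuple per item (column)
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical data: 5 respondents answering a 4-item preliminary scale.
scores = [
    [3, 4, 3, 4],
    [1, 2, 1, 2],
    [4, 4, 5, 4],
    [2, 3, 2, 2],
    [5, 4, 5, 5],
]
# Hypothetical total scores of the same respondents on an established scale.
established = [15, 7, 18, 9, 20]

totals = [sum(row) for row in scores]
print("Cronbach's alpha:", round(cronbach_alpha(scores), 3))
print("Concurrent validity (r):", round(pearson_r(totals, established), 3))
```

The same `pearson_r` helper could be applied to total scores from two administrations of the scale to estimate test-retest reliability. Which values of α or r count as acceptable is a judgment left to the researcher and the conventions of the field.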