Part of the book series: Food Science Text Series (FSTS)

Abstract

Scaling describes the application of numbers, or judgments that are converted to numerical values, to describe the perceived intensity of a sensory experience or the degree of liking or disliking for some experience or product. Scaling forms the basis for the sensory method of descriptive analysis. A variety of methods have been used for this purpose and, with some caution, all work well in differentiating products. This chapter discusses theoretical issues as well as practical considerations in scaling.

The vital importance of knowing the properties and limitations of a measuring instrument can hardly be denied by most natural scientists. However, the use of many different scales for sensory measurement is common within food science; but very few of these have ever been validated… .

—(Land and Shepard, 1984, pp. 144–145)

References

  • AACC (American Association of Cereal Chemists). 1986. Approved Methods of the AACC, Eighth Edition. Method 90–10. Baking quality of cake flour, rev. Oct. 1982. The American Association of Cereal Chemists, St. Paul, MN, pp. 1–4.

  • Anderson, N. H. 1974. Algebraic models in perception. In: E. C. Carterette and M. P. Friedman (eds.), Handbook of Perception. Psychophysical Judgment and Measurement, Vol. 2. Academic, New York, pp. 215–298.

  • Anderson, N. H. 1977. Note on functional measurement and data analysis. Perception and Psychophysics, 21, 201–215.

  • ASTM. 2008a. Standard test method for unipolar magnitude estimation of sensory attributes. Designation E 1697-05. In: Annual Book of ASTM Standards, Vol. 15.08, End Use Products. American Society for Testing and Materials, Conshohocken, PA, pp. 122–131.

  • ASTM. 2008b. Standard test method for sensory evaluation of red pepper heat. Designation E 1083-00. In: Annual Book of ASTM Standards, Vol. 15.08, End Use Products. American Society for Testing and Materials, Conshohocken, PA, pp. 49–53.

  • Aust, L. B., Gacula, M. C., Beard, S. A. and Washam, R. W., II. 1985. Degree of difference test method in sensory evaluation of heterogeneous product types. Journal of Food Science, 50, 511–513.

  • Baird, J. C. and Noma, E. 1978. Fundamentals of Scaling and Psychophysics. Wiley, New York.

  • Banks, W. P. and Coleman, M. J. 1981. Two subjective scales of number. Perception and Psychophysics, 29, 95–105.

  • Bartoshuk, L. M., Snyder, D. J. and Duffy, V. B. 2006. Hedonic gLMS: Valid comparisons for food liking/disliking across obesity, age, sex and PROP status. Paper presented at the 2006 Annual Meeting, Association for Chemoreception Sciences.

  • Bartoshuk, L. M., Duffy, V. B., Fast, K., Green, B. G., Prutkin, J. and Snyder, D. J. 2003. Labeled scales (e.g. category, Likert, VAS) and invalid across-group comparisons: What we have learned from genetic variation in taste. Food Quality and Preference, 14, 125–138.

  • Bartoshuk, L. M., Duffy, V. B., Green, B. G., Hoffman, H. J., Ko, C.-W., Lucchina, L. A., Marks, L. E., Snyder, D. J. and Weiffenbach, J. M. 2004a. Valid across-group comparisons with labeled scales: the gLMS versus magnitude matching. Physiology and Behavior, 82, 109–114.

  • Bartoshuk, L. M., Duffy, V. B., Chapo, A. K., Fast, K., Yiee, J. H., Hoffman, H. J., Ko, C.-W. and Snyder, D. J. 2004b. From psychophysics to the clinic: Missteps and advances. Food Quality and Preference, 14, 617–632.

  • Bartoshuk, L. M., Duffy, V. B., Fast, K., Green, B. G., Kveton, J., Lucchina, L. A., Prutkin, J. M., Snyder, D. J. and Tie, K. 1999. Sensory variability, food preferences and BMI in non-, medium and supertasters of PROP. Appetite, 33, 228–229.

  • Basker, D. 1988. Critical values of differences among rank sums for multiple comparisons. Food Technology, 42(2), 79, 80–84.

  • Baten, W. D. 1946. Organoleptic tests pertaining to apples and pears. Food Research, 11, 84–94.

  • Bendig, A. W. and Hughes, J. B. 1953. Effect of number of verbal anchoring and number of rating scale categories upon transmitted information. Journal of Experimental Psychology, 46(2), 87–90.

  • Bi, J. 2006. Sensory Discrimination Tests and Measurement. Blackwell, Ames, IA.

  • Birch, L. L., Zimmerman, S. I. and Hind, H. 1980. The influence of social-affective context on the formation of children’s food preferences. Child Development, 51, 856–861.

  • Birch, L. L., Birch, D., Marlin, D. W. and Kramer, L. 1982. Effects of instrumental consumption on children’s food preferences. Appetite, 3, 125–143.

  • Birnbaum, M. H. 1982. Problems with so-called “direct” scaling. In: J. T. Kuznicki, R. A. Johnson and A. F. Rutkiewic (eds.), Selected Sensory Methods: Problems and Approaches to Hedonics. American Society for Testing and Materials, Philadelphia, pp. 34–48.

  • Borg, G. 1982. A category scale with ratio properties for intermodal and interindividual comparisons. In: H.-G. Geissler and P. Petzold (Eds.), Psychophysical Judgment and the Process of Perception. VEB Deutscher Verlag der Wissenschaften, Berlin, pp. 25–34.

  • Borg, G. 1990. Psychophysical scaling with applications in physical work and the perception of exertion. Scandinavian Journal of Work and Environmental Health, 16, 55–58.

  • Boring, E. G. 1942. Sensation and Perception in the History of Experimental Psychology. Appleton-Century-Crofts, New York.

  • Brandt, M. A., Skinner, E. Z. and Coleman, J. A. 1963. The texture profile method. Journal of Food Science, 28, 404–409.

  • Butler, G., Poste, L. M., Wolynetz, M. S., Agar, V. E. and Larmond, E. 1987. Alternative analyses of magnitude estimation data. Journal of Sensory Studies, 2, 243–257.

  • Cardello, A. V. and Schutz, H. G. 2004. Research note. Numerical scale-point locations for constructing the LAM (Labeled affective magnitude) scale. Journal of Sensory Studies, 19, 341–346.

  • Cardello, A. V., Lawless, H. T. and Schutz, H. G. 2008. Effects of extreme anchors and interior label spacing on labeled magnitude scales. Food Quality and Preference, 21, 323–334.

  • Cardello, A. V., Winterhaler, C. and Schutz, H. G. 2003. Predicting the handle and comfort of military clothing fabrics from sensory and instrumental data: Development and application of new psychophysical methods. Textile Research Journal, 73, 221–237.

  • Cardello, A. V., Schutz, H. G., Lesher, L. L. and Merrill, E. 2005. Development and testing of a labeled magnitude scale of perceived satiety. Appetite, 44, 1–13.

  • Caul, J. F. 1957. The profile method of flavor analysis. Advances in Food Research, 7, 1–40.

  • Chambers, E. C. and Wolf, M. B. 1996. Sensory Testing Methods. ASTM Manual Series, MNL 26. ASTM International, West Conshohocken, PA.

  • Chen, A. W., Resurreccion, A. V. A. and Paguio, L. P. 1996. Age appropriate hedonic scales to measure the food preferences of young children. Journal of Sensory Studies, 11, 141–163.

  • Chung, S.-J. and Vickers, Z. 2007a. Long-term acceptability and choice of teas differing in sweetness. Food Quality and Preference, 18, 963–974.

  • Chung, S.-J. and Vickers, Z. 2007b. Influence of sweetness on the sensory-specific satiety and long-term acceptability of tea. Food Quality and Preference, 18, 256–267.

  • Coetzee, H. and Taylor, J. R. N. 1996. The use and adaptation of the paired comparison method in the sensory evaluation of hamburger-type patties by illiterate/semi-literate consumers. Food Quality and Preference, 7, 81–85.

  • Collins, A. A. and Gescheider, G. A. 1989. The measurement of loudness in individual children and adults by absolute magnitude estimation and cross modality matching. Journal of the Acoustical Society of America, 85, 2012–2021.

  • Conner, M. T. and Booth, D. A. 1988. Preferred sweetness of a lime drink and preference for sweet over non-sweet foods. Related to sex and reported age and body weight. Appetite, 10, 25–35.

  • Cordonnier, S. M. and Delwiche, J. F. 2008. An alternative method for assessing liking: Positional relative rating versus the 9-point hedonic scale. Journal of Sensory Studies, 23, 284–292.

  • Cox, E. P. 1980. The optimal number of response alternatives for a scale: A review. Journal of Marketing Research, 18, 407–422.

  • Curtis, D. W., Attneave, F. and Harrington, T. L. 1968. A test of a two-stage model of magnitude estimation. Perception and Psychophysics, 3, 25–31.

  • Edwards, A. L. 1952. The scaling of stimuli by the method of successive intervals. Journal of Applied Psychology, 36, 118–122.

  • Ekman, G. 1964. Is the power law a special case of Fechner’s law? Perceptual and Motor Skills, 19, 730.

  • Einstein, M. A. 1976. Use of linear rating scales for the evaluation of beer flavor by consumers. Journal of Food Science, 41, 383–385.

  • El Dine, A. N. and Olabi, A. 2009. Effect of reference foods in repeated acceptability tests: Testing familiar and novel foods using 2 acceptability scales. Journal of Food Science, 74, S97–S105.

  • Engen, T. 1974. Method and theory in the study of odor preferences. In: A. Turk, J. W. Johnson and D. G. Moulton (Eds.), Human Responses to Environmental Odors. Academic, New York.

  • Finn, A. and Louviere, J. J. 1992. Determining the appropriate response to evidence of public concern: The case of food safety. Journal of Public Policy and Marketing, 11, 12–25.

  • Forde, C. G. and Delahunty, C. M. 2004. Understanding the role cross-modal sensory interactions play in food acceptability in younger and older consumers. Food Quality and Preference, 15, 715–727.

  • Frijters, J. E. R., Kooistra, A. and Vereijken, P. F. G. 1980. Tables of d’ for the triangular method and the 3-AFC signal detection procedure. Perception and Psychophysics, 27, 176–178.

  • Gaito, J. 1980. Measurement scales and statistics: Resurgence of an old misconception. Psychological Bulletin, 87, 564–587.

  • Gay, C. and Mead, R. 1992. A statistical appraisal of the problem of sensory measurement. Journal of Sensory Studies, 7, 205–228.

  • Gent, J. F. and Bartoshuk, L. M. 1983. Sweetness of sucrose, neohesperidin dihydrochalcone and saccharin is related to genetic ability to taste the bitter substance 6-n-propylthiouracil. Chemical Senses, 7, 265–272.

  • Gescheider, G. A. 1988. Psychophysical scaling. Annual Review of Psychology, 39, 169–200.

  • Giovanni, M. E. and Pangborn, R. M. 1983. Measurement of taste intensity and degree of liking of beverages by graphic scaling and magnitude estimation. Journal of Food Science, 48, 1175–1182.

  • Gracely, R. H., McGrath, P. and Dubner, R. 1978a. Ratio scales of sensory and affective verbal-pain descriptors. Pain, 5, 5–18.

  • Gracely, R. H., McGrath, P. and Dubner, R. 1978b. Validity and sensitivity of ratio scales of sensory and affective verbal-pain descriptors: Manipulation of affect by Diazepam. Pain, 5, 19–29.

  • Green, B. G., Shaffer, G. S. and Gilmore, M. M. 1993. Derivation and evaluation of a semantic scale of oral sensation magnitude with apparent ratio properties. Chemical Senses, 18, 683–702.

  • Green, B. G., Dalton, P., Cowart, B., Shaffer, G., Rankin, K. and Higgins, J. 1996. Evaluating the “Labeled Magnitude Scale” for measuring sensations of taste and smell. Chemical Senses, 21, 323–334.

  • Greene, J. L., Bratka, K. J., Drake, M. A. and Sanders, T. H. 2006. Effectiveness of category and line scales to characterize consumer perception of fruity fermented flavors in peanuts. Journal of Sensory Studies, 21, 146–154.

  • Guest, S., Essick, G., Patel, A., Prajapati, R. and McGlone, F. 2007. Labeled magnitude scales for oral sensations of wetness, dryness, pleasantness and unpleasantness. Food Quality and Preference, 18, 342–352.

  • Hein, K. A., Jaeger, S. R., Carr, B. T. and Delahunty, C. M. 2008. Comparison of five common acceptance and preference methods. Food Quality and Preference, 19, 651–661.

  • Huskisson, E. C. 1983. Visual analogue scales. In: R. Melzack (Ed.), Pain Measurement and Assessment. Raven, New York, pp. 34–37.

  • Jaeger, S. R., Jørgensen, A. S., Aaslyng, M. D. and Bredie, W. L. P. 2008. Best-worst scaling: An introduction and initial comparison with monadic rating for preference elicitation with food products. Food Quality and Preference, 19, 579–588.

  • Jaeger, S. R. and Cardello, A. V. 2009. Direct and indirect hedonic scaling methods: A comparison of the labeled affective magnitude (LAM) scale and best-worst scaling. Food Quality and Preference, 20, 249–258.

  • Jones, F. N. 1974. History of psychophysics and judgment. In: E. C. Carterette and M. P. Friedman (Eds.), Handbook of Perception. Psychophysical Judgment and Measurement, Vol. 2. Academic, New York, pp. 11–22.

  • Jones, L. V. and Thurstone, L. L. 1955. The psychophysics of semantics: An experimental investigation. Journal of Applied Psychology, 39, 31–36.

  • Jones, L. V., Peryam, D. R. and Thurstone, L. L. 1955. Development of a scale for measuring soldiers’ food preferences. Food Research, 20, 512–520.

  • Keskitalo, K., Knaapila, A., Kallela, M., Palotie, A., Wessman, M., Sammalisto, S., Peltonen, L., Tuorila, H. and Perola, M. 2007. Sweet taste preferences are partly genetically determined: Identification of a trait locus on chromosome 16. American Journal of Clinical Nutrition, 86, 55–63.

  • Kim, K.-O. and O’Mahony, M. 1998. A new approach to category scales of intensity I: Traditional versus rank-rating. Journal of Sensory Studies, 13, 241–249.

  • King, B. M. 1986. Odor intensity measured by an audio method. Journal of Food Science, 51, 1340–1344.

  • Koo, T.-Y., Kim, K.-O., and O’Mahony, M. 2002. Effects of forgetting on performance on various intensity scaling protocols: Magnitude estimation and labeled magnitude scale (Green scale). Journal of Sensory Studies, 17, 177–192.

  • Kroll, B. J. 1990. Evaluating rating scales for sensory testing with children. Food Technology, 44(11), 78–80, 82, 84, 86.

  • Kurtz, D. B., White, T. L. and Hayes, M. 2000. The labeled dissimilarity scale: A metric of perceptual dissimilarity. Perception and Psychophysics, 62, 152–161.

  • Land, D. G. and Shepard, R. 1984. Scaling and ranking methods. In: J. R. Piggott (ed.), Sensory Analysis of Foods. Elsevier Applied Science, London, pp. 141–177.

  • Lane, H. L., Catania, A. C. and Stevens, S. S. 1961. Voice level: Autophonic scale, perceived loudness and effect of side tone. Journal of the Acoustical Society of America, 33, 160–167.

  • Larson-Powers, N. and Pangborn, R. M. 1978. Descriptive analysis of the sensory properties of beverages and gelatins containing sucrose or synthetic sweeteners. Journal of Food Science, 43, 47–51.

  • Lawless, H. T. 1977. The pleasantness of mixtures in taste and olfaction. Sensory Processes, 1, 227–237.

  • Lawless, H. T. 1989. Logarithmic transformation of magnitude estimation data and comparisons of scaling methods. Journal of Sensory Studies, 4, 75–86.

  • Lawless, H. T. and Clark, C. C. 1992. Psychological biases in time intensity scaling. Food Technology, 46, 81, 84–86, 90.

  • Lawless, H. T. and Malone, J. G. 1986a. The discriminative efficiency of common scaling methods. Journal of Sensory Studies, 1, 85–96.

  • Lawless, H. T. and Malone, J. G. 1986b. A comparison of scaling methods: Sensitivity, replicates and relative measurement. Journal of Sensory Studies, 1, 155–174.

  • Lawless, H. T. and Skinner, E. Z. 1979. The duration and perceived intensity of sucrose taste. Perception and Psychophysics, 25, 249–258.

  • Lawless, H. T., Popper, R. and Kroll, B. J. 2010a. Comparison of the labeled affective magnitude (LAM) scale, an 11-point category scale and the traditional nine-point hedonic scale. Food Quality and Preference, 21, 4–12.

  • Lawless, H. T., Sinopoli, D. and Chapman, K. W. 2010b. A comparison of the labeled affective magnitude scale and the nine point hedonic scale and examination of categorical behavior. Journal of Sensory Studies, 25, S1, 54–66.

  • Lawless, H. T., Cardello, A. V., Chapman, K. W., Lesher, L. L., Given, Z. and Schutz, H. G. 2010c. A comparison of the effectiveness of hedonic scales and end-anchor compression effects. Journal of Sensory Studies, 25, S1, 18–34.

  • Lee, H.-J., Kim, K.-O., and O’Mahony, M. 2001. Effects of forgetting on various protocols for category and line scales of intensity. Journal of Sensory Studies, 327–342.

  • Likert, R. 1932. Technique for the measurement of attitudes. Archives of Psychology, 140, 1–55.

  • Lindvall, T. and Svensson, L. T. 1974. Equal unpleasantness matching of malodourous substances in the community. Journal of Applied Psychology, 59, 264–269.

  • Mahoney, C. H., Stier, H. L. and Crosby, E. A. 1957. Evaluating flavor differences in canned foods. II. Fundamentals of the simplified procedure. Food Technology 11, Supplemental Symposium Proceedings, 37–42.

  • Marks, L. E. 1978. Binaural summation of the loudness of pure tones. Journal of the Acoustical Society of America, 64, 107–113.

  • Marks, L. E., Borg, G. and Ljunggren, G. 1983. Individual differences in perceived exertion assessed by two new methods. Perception and Psychophysics, 34, 280–288.

  • Marks, L. E., Borg, G. and Westerlund, J. 1992. Differences in taste perception assessed by magnitude matching and by category-ratio scaling. Chemical Senses, 17, 493–506.

  • Mattes, R. D. and Lawless, H. T. 1985. An adjustment error in optimization of taste intensity. Appetite, 6, 103–114.

  • McBride, R. L. 1983a. A JND-scale/category scale convergence in taste. Perception and Psychophysics, 34, 77–83.

  • McBride, R. L. 1983b. Taste intensity and the case of exponents greater than 1. Australian Journal of Psychology, 35, 175–184.

  • McBurney, D. H. and Shick, T. R. 1971. Taste and water taste for 26 compounds in man. Perception and Psychophysics, 10, 249–252.

  • McBurney, D. H. and Bartoshuk, L. M. 1973. Interactions between stimuli with different taste qualities. Physiology and Behavior, 10, 1101–1106.

  • McBurney, D. H., Smith, D. V. and Shick, T. R. 1972. Gustatory cross-adaptation: Sourness and bitterness. Perception and Psychophysics, 11, 228–232.

  • Mead, R. and Gay, C. 1995. Sequential design of sensory trials. Food Quality and Preference, 6, 271–280.

  • Mecredy, J. M., Sonnemann, J. C. and Lehmann, S. J. 1974. Sensory profiling of beer by a modified QDA method. Food Technology, 28, 36–41.

  • Meilgaard, M., Civille, G. V. and Carr, B. T. 2006. Sensory Evaluation Techniques, Fourth Edition. CRC, Boca Raton, FL.

  • Moore, L. J. and Shoemaker, C. F. 1981. Sensory textural properties of stabilized ice cream. Journal of Food Science, 46, 399–402.

  • Moskowitz, H. R. 1971. The sweetness and pleasantness of sugars. American Journal of Psychology, 84, 387–405.

  • Moskowitz, H. R. and Sidel, J. L. 1971. Magnitude and hedonic scales of food acceptability. Journal of Food Science, 36, 677–680.

  • Muñoz, A. M. and Civille, G. V. 1998. Universal, product and attribute specific scaling and the development of common lexicons in descriptive analysis. Journal of Sensory Studies, 13, 57–75.

  • Newell, G. J. and MacFarlane, J. D. 1987. Expanded tables for multiple comparison procedures in the analysis of ranked data. Journal of Food Science, 52, 1721–1725.

  • Olabi, A. and Lawless, H. T. 2008. Persistence of context effects with training and reference standards. Journal of Food Science, 73, S185–S189.

  • O’Mahony, M., Park, H., Park, J. Y. and Kim, K.-O. 2004. Comparison of the statistical analysis of hedonic data using analysis of variance and multiple comparisons versus an R-index analysis of the ranked data. Journal of Sensory Studies, 19, 519–529.

  • Pangborn, R. M. and Dunkley, W. L. 1964. Laboratory procedures for evaluating the sensory properties of milk. Dairy Science Abstracts, 26, 55–62.

  • Parducci, A. 1965. Category judgment: A range-frequency model. Psychological Review, 72, 407–418.

  • Park, J.-Y., Jeon, S.-Y., O’Mahony, M. and Kim, K.-O. 2004. Induction of scaling errors. Journal of Sensory Studies, 19, 261–271.

  • Pearce, J. H., Korth, B. and Warren, C. B. 1986. Evaluation of three scaling methods for hedonics. Journal of Sensory Studies, 1, 27–46.

  • Peryam, D. 1989. Reflections. In: Sensory Evaluation. In Celebration of our Beginnings. American Society for Testing and Materials, Philadelphia, pp. 21–30.

  • Peryam, D. R. and Girardot, N. F. 1952. Advanced taste-test method. Food Engineering, 24, 58–61, 194.

  • Piggott, J. R. and Harper, R. 1975. Ratio scales and category scales for odour intensity. Chemical Senses and Flavour, 1, 307–316.

  • Pokorný, J., Davídek, J., Prnka, V. and Davídková, E. 1986. Nonparametric evaluation of graphical sensory profiles for the analysis of carbonated beverages. Die Nahrung, 30, 131–139.

  • Poulton, E. C. 1989. Bias in Quantifying Judgments. Lawrence Erlbaum, Hillsdale, NJ.

  • Richardson, L. F. and Ross, J. S. 1930. Loudness and telephone current. Journal of General Psychology, 3, 288–306.

  • Rosenthal, R. 1987. Judgment Studies: Design, Analysis and Meta-Analysis. University Press, Cambridge.

  • Shand, P. J., Hawrysh, Z. J., Hardin, R. T. and Jeremiah, L. E. 1985. Descriptive sensory analysis of beef steaks by category scaling, line scaling and magnitude estimation. Journal of Food Science, 50, 495–499.

  • Schutz, H. G. and Cardello, A. V. 2001. A labeled affective magnitude (LAM) scale for assessing food liking/disliking. Journal of Sensory Studies, 16, 117–159.

  • Siegel, S. 1956. Nonparametric Statistics for the Behavioral Sciences. McGraw-Hill, New York.

  • Sriwatanakul, K., Kelvie, W., Lasagna, L., Calimlim, J. F., Wels, O. F. and Mehta, G. 1983. Studies with different types of visual analog scales for measurement of pain. Clinical Pharmacology and Therapeutics, 34, 234–239.

  • Stevens, J. C. and Marks, L. E. 1980. Cross-modality matching functions generated by magnitude estimation. Perception and Psychophysics, 27, 379–389.

  • Stevens, S. S. 1951. Mathematics, measurement and psychophysics. In: S. S. Stevens (ed.), Handbook of Experimental Psychology. Wiley, New York, pp. 1–49.

  • Stevens, S. S. 1956. The direct estimation of sensory magnitudes—loudness. American Journal of Psychology, 69, 1–25.

  • Stevens, S. S. 1957. On the psychophysical law. Psychological Review, 64, 153–181.

  • Stevens, S. S. 1969. On predicting exponents for cross-modality matches. Perception and Psychophysics, 6, 251–256.

  • Stevens, S. S. and Galanter, E. H. 1957. Ratio scales and category scales for a dozen perceptual continua. Journal of Experimental Psychology, 54, 377–411.

  • Stoer, N. L. and Lawless, H. T. 1993. Comparison of single product scaling and relative-to-reference scaling in sensory evaluation of dairy products. Journal of Sensory Studies, 8, 257–270.

  • Stone, H., Sidel, J., Oliver, S., Woolsey, A. and Singleton, R. C. 1974. Sensory Evaluation by quantitative descriptive analysis. Food Technology, 28, 24–29, 32, 34.

  • Teghtsoonian, M. 1980. Children’s scales of length and loudness: A developmental application of cross-modal matching. Journal of Experimental Child Psychology, 30, 290–307.

  • Thurstone, L. L. 1927. A law of comparative judgment. Psychological Review, 34, 273–286.

  • Townsend, J. T. and Ashby, F. G. 1980. Measurement scales and statistics: The misconception misconceived. Psychological Bulletin, 96, 394–401.

  • Vickers, Z. M. 1983. Magnitude estimation vs. category scaling of the hedonic quality of food sounds. Journal of Food Science, 48, 1183–1186.

  • Villanueva, N. D. M. and Da Silva, M. A. A. P. 2009. Performance of the nine-point hedonic, hybrid and self-adjusting scales in the generation of internal preference maps. Food Quality and Preference, 20, 1–12.

  • Villanueva, N. D. M., Petenate, A. J., and Da Silva, M. A. A. P. 2005. Comparative performance of the hybrid hedonic scale as compared to the traditional hedonic, self-adjusting and ranking scales. Food Quality and Preference, 16, 691–703.

  • Ward, L. M. 1986. Mixed-modality psychophysical scaling: Double cross-modality matching for “difficult” continua. Perception and Psychophysics, 39, 407–417.

  • Weiss, D. J. 1972. Averaging: an empirical validity criterion for magnitude estimation. Perception and Psychophysics, 12, 385–388.

  • Winakor, G., Kim, C. J. and Wolins, L. 1980. Fabric hand: Tactile sensory assessment. Textile Research Journal, 50, 601–610.

  • Yamaguchi, S. 1967. The synergistic effect of monosodium glutamate and disodium 5′-inosinate. Journal of Food Science, 32, 473–477.

Appendices

Appendix 1: Derivation of Thurstonian-Scale Values for the 9-Point Scale

The choice of adjective words for the 9-point hedonic scale is a good example of how carefully a scale can be constructed. The long-standing track record of this tool demonstrates its utility and wide applicability in consumer testing. However, few sensory practitioners actually know how the adjectives were found and what criteria were brought to bear in selecting these descriptors (slightly, moderately, very much, and extremely like/dislike) from a larger pool of possible words. The goal of this section is to provide a shorthand description of the criteria and mathematical method used to select the words for this scale.

One concern was the degree to which the term had consensual meaning in the population. The most serious problem arose when a candidate word had an ambiguous or double meaning across the population. For example, the word “average” suggests an intermediate response to some people, but in the original study by Jones and Thurstone (1955) there was a group of people who equated it with “like moderately,” perhaps because an average product in those days was one that people would like. These days, one can think of negative connotations to the word “average,” as in “he was only an average student.” Other ambiguous or bimodal terms were “like not so much” and “like not so well.” Ideally, a term should have low variability in meaning, i.e., a low standard deviation, no bimodality, and little skew. Part of the reason for this concern with the normality of the distribution of psychological reactions to a word was that the developers used Thurstone’s model for categorical judgment as a means of measuring the psychological-scale values for the words. This model takes its simplest form when the items to be scaled show normal distributions of equal variance.

This leads us to the numerical method. Jones and Thurstone modified a procedure used earlier by Edwards (1952). A description of the process and results can be found in the paper “Development of a scale for measuring soldiers’ food preferences” by Jones et al. (1955). Fifty-one words and phrases formed the candidate list based on a pilot study with 900 soldiers chosen to be a representative sample of enlisted personnel. Each phrase was presented on a form with a rating scale from –4 to +4 in a check-off format. In other words, each person read each phrase and assigned it an integer value from –4 to +4 (including zero as an option). This method would seem to presume that these integers were themselves an interval scale of psychological magnitude, an assumption that to our knowledge has never been questioned.

Of course, mean scale values could now be assigned directly from the raw ratings, but Thurstonian methods do not use the raw numbers as the scale. Instead, they transform the ratings so that standard deviations become the units of measurement, which means the scale needs to be converted to z-score values. The exact steps are as follows:

  1. Accumulate frequency counts for all the tested words across the –4 to +4 scale. Think of these categories as little “buckets” into which judgments have been tossed.

  2. Find the marginal proportions for each value from –4 to +4 (summed across all test items). Add up the proportions from lowest to highest to get a cumulative proportion for each bucket.

  3. Convert these proportions to z-scores in order to re-scale the boundaries for the original –4 to +4 cutoffs. Let us call these the “category z-values” for each of the “buckets.” The top bucket will have a cumulative proportion of 100%, so it will have no z-score (undefined/infinite).

  4. Next examine each individual item. Sum its individual proportions across the categories, from where it is first used until 100% of the responses are accumulated.

  5. Convert the proportions for the item to z-scores. Alternatively, you can plot these proportions on “cumulative probability paper,” a graphing format that marks the ordinate in equal standard deviation units according to the cumulative normal distribution. Either of these methods will tend to make the cumulative S-shaped curve for the item into a straight line. The X-axis value for each point is the “category z-value” for that bucket.

  6. Fit a line to the data and interpolate the 50% point on the X-axis (the re-scaled category boundary estimates). These interpolated values for the median for each item now form the new scale values for the items.
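The six steps can be sketched in Python as follows. This is only an illustration under stated assumptions, not the computation of Jones et al. (1955): the frequency counts are invented, the function names are hypothetical, each bucket's category z-value is taken from the cumulative proportion at that bucket, and the line in step 6 is fitted by ordinary least squares (the text says only "fit a line").

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def cumulative_proportions(counts):
    """Running proportion of responses at or below each category."""
    total = sum(counts)
    running, out = 0, []
    for c in counts:
        running += c
        out.append(running / total)
    return out


def category_z_values(all_counts):
    """Steps 1-3: pool counts over all items and convert the cumulative
    proportions to z-scores. Proportions of 0 or 1 have no finite z (None)."""
    n_cat = len(next(iter(all_counts.values())))
    pooled = [sum(item[k] for item in all_counts.values()) for k in range(n_cat)]
    return [
        _STD_NORMAL.inv_cdf(p) if 0.0 < p < 1.0 else None
        for p in cumulative_proportions(pooled)
    ]


def scale_value(item_counts, cat_z):
    """Steps 4-6: regress the item's z-scores on the category z-values and
    return (x where the fitted line crosses z = 0, slope of the line)."""
    xs, ys = [], []
    for p, x in zip(cumulative_proportions(item_counts), cat_z):
        if x is not None and 0.0 < p < 1.0:  # skip undefined z-scores
            xs.append(x)
            ys.append(_STD_NORMAL.inv_cdf(p))
    if len(xs) < 2:
        raise ValueError("need at least two usable points to fit a line")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -intercept / slope, slope


# Hypothetical counts for categories -4..+4 (NOT the Jones et al. data).
counts = {
    "phrase A (negative)": [10, 25, 40, 15, 6, 2, 1, 1, 0],
    "phrase B (neutral)": [1, 3, 10, 22, 30, 20, 9, 4, 1],
    "phrase C (positive)": [0, 1, 2, 4, 8, 15, 35, 25, 10],
}

cat_z = category_z_values(counts)
values = {name: scale_value(c, cat_z) for name, c in counts.items()}
for name, (x0, slope) in values.items():
    print(f"{name}: scale value {x0:+.2f}, slope {slope:.2f}")
```

The slope is returned alongside the scale value because a steeper line corresponds to less spread in how respondents placed the phrase, which is the kind of diagnostic discussed for the plotted phrases.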

An example of this interpolation is shown in Fig. 7.7. The figure shows three of the phrases used in the original scaling study of Jones and Thurstone (1955). These three were not ultimately chosen, but we have approximate proportions and z-scores for them from the original figures. The small vertical arrows on the X-axis show the scale values for the original categories of –4 to +3 (+4 has a cumulative proportion of 100% and thus the z-score is infinite). Table 7.1 gives the values and proportions for each phrase and the original categories. The dashed vertical lines dropped from the intersection at the zero z-score (50% point) show the approximate mean values interpolated on the X-axis (i.e., about –1.1 for “do not care for it” and about +2.1 for “preferred”). Note that “preferred” and “don’t care for it” have a linear fit and steep slope, suggesting a normal distribution and low standard deviation. In contrast, “highly unfavorable” has a lower slope and some curvilinearity, indicative of higher variability, skew, and/or pockets of disagreement about the connotation of this term.

Fig. 7.7 An illustration of the method used to establish spacings and scale values for the 9-point hedonic scale using Thurstonian theory. Arrows on the X-axis show the scale points for the z-scores based on the complete distribution of the original –4 to +4 ratings. The Y-axis shows the actual z-scores based on the proportion of respondents using that category for each specific term. Re-plotted from data provided in Jones et al. (1955).

Table 7.1 Examples of scaled phrases used in Fig. 7.7

The actual scale values for the original adjectives are shown in Table 7.2, as found with a soldier population circa 1950 (Jones et al., 1955). You may note that the words are not equally spaced: the “slightly” values are closer to the neutral point than some of the other intervals would suggest, and the extreme points are a little farther out. This bears a good deal of similarity to the intervals found with the LAM scale, as shown in the column where the LAM values are re-scaled to the same range as the 9-point Thurstonian values.

Table 7.2 Actual 9-point scale phrase values and comparison to the LAM values

Appendix 2: Construction of Labeled Magnitude Scales

There are two primary methods for constructing labeled magnitude scales and they are very similar. Both require magnitude estimates from the participants to scale the word phrases used on the lines. In one case, just the word phrases are scaled, and in the second method, the word phrases are scaled among a list of common everyday experiences or sensations that most people are familiar with. The values obtained by the simple scaling of just the words will depend upon the words that are chosen, and extremely high examples (e.g., greatest imaginable liking for any experience) will tend to compress the values of the interior phrases (Cardello et al., 2008). Whether this kind of context effect will occur for the more general method of scaling amongst common experiences is not known. But the use of a broad frame of reference could be a stabilizing factor.

Here is an example of the instructions given to subjects in construction of a labeled affective magnitude scale. Note that for hedonics, which are a bipolar continuum with a neutral point, it is necessary to collect a tone or valence (plus or minus) value as well as the overall “intensity” rating.

Next to each word label a response area appeared similar to this:

Phrase: Tone: + – 0 How much:

Like extremely __________ _______

Words or phrases are presented in random order. After reading a word, participants must decide whether the word is positive, negative, or neutral and place the corresponding symbol on the first line. If the hedonic tone is not neutral (zero value), they are instructed to give a numerical estimate using modulus-free magnitude estimation. The following is a sample of the instructions taken from Cardello et al. (2008):

After having determined whether the phrase is positive or negative or neutral and writing the appropriate symbol (+, –, 0) on the first line, you will then assess the strength or magnitude of the liking or disliking reflected by the phrase. You will do this by placing a number on the second blank line (under “How Much”). For the first phrase that you rate, you can write any number you want on the line. We suggest you do not use a small number for this word/phrase. The reason for this is that subsequent words/phrases may reflect much lower levels of liking or disliking. Aside from this restriction you can use any numbers you want. For each subsequent word/phrase your numerical judgment should be made proportionally and in comparison to the first number. That is, if you assigned the number 800 to index the strength of the liking/disliking denoted by the first word/phrase and the strength of liking/disliking denoted by the second word/phrase were twice as great, you would assign the number 1,600. If it were three times as great you would assign the number 2,400, etc. Similarly, if the second word/phrase denoted only 1/10 the magnitude of liking as the first, you would assign it the number 80 and so forth. If any word/phrase is judged to be “neutral” (zero (0) on the first line) it should also be given a zero for its magnitude rating.

In the case of Cardello et al. (2008), positive and negative word labels were analyzed separately. Raw magnitude estimates were equalized for scale range using the procedure of Lane et al. (1961). All positive and negative magnitude estimates for a given subject were multiplied by an individual scaling factor. This factor was equal to the grand geometric mean (of the absolute values of all nonzero ratings) across all subjects divided by the geometric mean for that subject. The geometric mean magnitude estimates for each phrase were then calculated from these range-equated data. These means became the distances from the zero point for placement of the phrases along the scale, each usually marked by a short cross-hatch at that point.
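A minimal sketch of this range equalization is given below. It assumes signed ratings (tone combined with magnitude), skips zero ("neutral") ratings when taking geometric means, and takes the sign of a phrase from the sign of its summed ratings. The data, the function names, and those handling choices are illustrative assumptions, not details reported by Cardello et al. (2008) or Lane et al. (1961).

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def range_equalize(ratings):
    """ratings: {subject: {phrase: signed magnitude estimate}}.
    Each subject's ratings are multiplied by
    (grand geometric mean) / (that subject's geometric mean)."""
    nonzero = {
        s: [abs(v) for v in r.values() if v != 0] for s, r in ratings.items()
    }
    grand = geometric_mean([v for vals in nonzero.values() for v in vals])
    equalized = {}
    for s, r in ratings.items():
        factor = grand / geometric_mean(nonzero[s])
        equalized[s] = {phrase: v * factor for phrase, v in r.items()}
    return equalized


def phrase_scale_values(equalized):
    """Signed geometric mean per phrase = distance from the zero point.
    Assumption: zero ratings are skipped; all-zero phrases are placed at 0."""
    by_phrase = {}
    for r in equalized.values():
        for phrase, v in r.items():
            by_phrase.setdefault(phrase, []).append(v)
    out = {}
    for phrase, vals in by_phrase.items():
        mags = [abs(v) for v in vals if v != 0]
        if not mags:
            out[phrase] = 0.0
            continue
        sign = 1.0 if sum(vals) > 0 else -1.0
        out[phrase] = sign * geometric_mean(mags)
    return out


# Hypothetical data: two subjects give the same ratios with different numbers.
ratings = {
    "s1": {"like extremely": 800, "like slightly": 80,
           "neither like nor dislike": 0, "dislike extremely": -800},
    "s2": {"like extremely": 100, "like slightly": 10,
           "neither like nor dislike": 0, "dislike extremely": -100},
}

equalized = range_equalize(ratings)
positions = phrase_scale_values(equalized)
for phrase, pos in positions.items():
    print(f"{phrase}: {pos:+.1f}")
```

Because both hypothetical subjects use the same ratios between phrases, their equalized ratings coincide, which shows that the scaling factor removes differences in number range without altering the ratios.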

© 2010 Springer Science+Business Media, LLC

Lawless, H., Heymann, H. (2010). Scaling. In: Sensory Evaluation of Food. Food Science Text Series. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-6488-5_7
