Evaluating Comparability of Survey Data on Subjective Well-being

Kristoffersen, Ingebjørg

doi:10.1007/978-3-319-61810-4_8

Part of the book series: Happiness Studies Book Series ((HAPS))

Abstract

This chapter examines the problem of comparability in the context of microeconomic survey data, focussing particularly on the commonly used 0–10 numeric response scale. Most of the discussion of comparability in the literature concerns interpersonal (across-individual) comparability. However, the increasing availability of panel data implies a need to discuss intertemporal (within-individual) comparability as well. This chapter provides a discussion of the nature, causes and consequences of comparability issues in subjective well-being data, and an overview of possible approaches to this problem. Finally, some worked examples and empirical evidence are presented, using Australian data. These results support the assumption that the eleven-point numeric life satisfaction scale yields scores which are ordinally distinct both across and within individuals, and that the assumption of equidistance across the scale (and therefore of cardinal comparability) seems reasonable.

The content of this chapter draws on prior work published in The Economic Record, 2010, Vol 86(272), pp 98–123, under the title The Metrics of Well-being: Cardinality, Neutrality and Additivity; and also work published in Social Indicators Research, 2017, Vol 130(2), pp 845–865, under the title Metrics of Subjective Well-being Data: An Empirical Evaluation of the Ordinal and Cardinal Comparability of Life Satisfaction Scores. This chapter includes empirical analyses based on unit record data from the Household, Income and Labour Dynamics in Australia (HILDA) survey. The HILDA project was initiated and funded by the Australian Government Department of Families, Housing, Community Services and Indigenous Affairs (FaHCSIA) and is managed by the Melbourne Institute of Applied Economic and Social Research (MIAESR). The findings and views reported in this chapter, as well as any mistakes or errors, are those of the author, and should not be attributed to FaHCSIA or MIAESR.

Notes

1.
For an excellent discussion of Edgeworth’s (1881 [1961]) work and its relevance to contemporary measurement of subjective well-being, see Colander (2007). Bruni and Sugden (2007) also present a comprehensive history of how economics has approached (and avoided) the measurement of well-being.
2.
In other words, such measurement scales exhibit interval-level quality. A possible further assumption implies ratio-level quality. Ratio quality requires, in addition to equidistance of score points, that the measurement scale has a non-arbitrary zero-point, or value of neutrality. Ratio-level quality is not usually implied by the ways in which subjective well-being data are used and interpreted in the literature, hence this level of quality is not considered very important in this context. A more in-depth discussion is provided in Kristoffersen (2010).
3.
For a review of the issue of interpersonal and intertemporal comparability of well-being, see, for example, Larsen and Fredrickson (1999). For a comprehensive review on issues to do with international comparability, see Diener and Suh (2000).
4.
First, implicit trade-offs, as measured in empirical models of subjective well-being, generally correspond well with what we know about choice behaviour: for example, the observed positive effects of marriage and employment on subjective well-being correspond well with the amount of effort people tend to put into obtaining these outcomes. Second, observed behaviour is consistent with what we expect from well-being-maximising individuals: for example, low satisfaction scores in the spheres of work and marriage tend to be good predictors of job change and divorce. Finally, the evidence which emerges from the analysis of survey data on subjective well-being corresponds well with that which emerges from experimental economics, particularly with respect to positional concerns (Clark et al., 2008).
5.
This basic model, and the notation used, follows Blanchflower and Oswald (2004).
6.
‘True’ well-being or utility might be interpreted as the individual’s actual experience, or whatever the social scientist is trying to measure and understand, similarly to how other psychological concepts such as intelligence and personality traits are measured.
7.
The response functions illustrated in Fig. 1 are fitted to a numeric scale, but could easily be modified to fit a scale consisting of ordered verbal responses. Note that the focus here is not comparison across different types of survey instruments (and thus different measurement scales) but rather differences within individuals’ perceptions of the same measurement scale. For convenience, the second diagram of Fig. 1 assumes a linear response function. Other functional forms are discussed in turn and can easily be considered with similar implications.
8.
Relatedly, set-point theory asserts that while individuals’ subjective well-being can vary in the short term, reacting to various events that occur in their lives, they tend to revert to a given baseline level of subjective well-being over time (Headey, 2007; Lucas, 2007). Thus, each individual has some internal set-point level which might be largely determined by genetics.
9.
There are some recent techniques to correct for differences in individual levels of well-being, for example the vignettes method (Kapteyn, Smith, & van Soest, 2007).
10.
This paradox originates in Easterlin’s (1974) seminal paper where he demonstrates that US citizens’ levels of happiness have largely remained unchanged since the Second World War, despite living standards having improved dramatically during the post-war years. A good discussion of the Easterlin paradox is provided in Clark et al. (2008).
11.
The same may be said for comparison within individuals across time. That is, a person may select a score of 9 one year, and also the next year, despite actually being more satisfied, due to changes in perceptions as to what is possible. Changes in reference points may occur through key life events, such as romantic relationships (which might extend the scale of what levels of happiness and sadness are possible) and bereavement.
12.
For a discussion on survey design and approaches to measuring well-being, see for example Conti and Pudney (2008).
13.
Some evidence of such effects in subjective well-being data is provided by Lau (2007).
14.
For a brief general discussion on extreme response, see Larsen (1999). More specific evidence is presented by Brulé and Veenhoven (2017), who specifically examine individuals’ propensity for scoring 10 on a 0–10 subjective well-being scale. Lau (2007) also asks respondents to recall a situation in their lives where they felt extremely good and to give a well-being score for that particular situation. If a respondent did not choose the highest score, they were subsequently asked why they did not do so. The most common reasons for not choosing the highest score given by Australian respondents were ‘did not reach standard of a “10” rating’ (29.2%), ‘a rating of 10 is never attainable’ (38.5%), ‘optimism’ (14.5%) and ‘modesty’ (4.2%).
15.
For example, Blanchflower and Oswald (2004, 2005), Gardner and Oswald (2001) and Headey and Wooden (2004) find that results are robust across these models. Van Praag and Ferrer-i-Carbonell (2004) conclude similarly from their comprehensive collection of analyses.
16.
Specifically, Rasch models apply additive conjoint measurement (Luce & Tukey, 1964) to produce a measure where conjoint transitivity implies that items and persons are measured on an interval scale with a common unit (Brogden, 1977; Wright, 1997). Andrich (1978) later developed the polytomous Rasch model for multiple ordered (rather than dichotomous) responses. See Wright (1997) for a brief description of the history and development of measurement in social sciences.
17.
Subjective well-being is sometimes also measured using multi-item scales (Diener, Emmons, Larsen & Griffin, 1985), or as aggregates of multidimensional measurement of subjective well-being. However, the focus here remains on single-item numeric scales.
18.
As explained by Luce and Tukey (1964): ‘the essential character of simultaneous conjoint measurement is described by an axiomatisation of the comparison of effects of (or responses to) pairs formed from two specific kinds of “quantities”.’ They explain that these can potentially produce a cardinal (interval quality) measure: ‘The axioms apply when, for example, the effect of a pair consisting of one mass and one difference in gravitational potential on a device that responds to momentum is compared with the effect of another such pair. Measurement on interval scales which have a common unit follows from these axioms’.
19.
The set of possible response functions illustrated in Fig. 2 produces a limited set of possible shapes of the observable function k, which itself also likely lies somewhere in the spectrum between logistic and logit, depending on the shape of function g. For example, a linear function k can only result from functions g and h taking exactly the same form (with the same strength in curvature). This is because function k is function h transformed by g⁻¹, so if h and g have the same shape and curvature, function k will be linear. If g and h take opposite forms, then the form of k will be an exaggeration of h. Of course, many other possibilities exist. This is discussed in further detail in Kristoffersen (2017).
20.
This worked example is a simplified and extended version of that which is presented in Kristoffersen (2017).
21.
Note that the comparison between life satisfaction and mental health scores implies an inconsistency with respect to timing. The former is general in nature and has no specific time frame attached to it, while the latter specifically refers to experiences over the past four weeks. When individuals evaluate how satisfied they are with life (or any aspect thereof) this is likely to be some function of past (remembered), current and expected future satisfaction. It is entirely up to the individual how large a time frame they wish to consider, and there are probably also individual differences in the ‘discount rate’ with which distant experiences are weighed compared to proximate ones. While this inconsistency is acknowledged here, it is not considered likely to compromise these results unduly. If so, this would likely manifest more so in random noise than in any conceivable bias.
22.
This assumption may be considered reasonable due to the greater degree of objectivity in how mental health is defined and measured. Subjective well-being and mental health both capture information about true well-being. Unlike life satisfaction, what constitutes poor or good mental health is defined by the instrument rather than the respondent. Furthermore, this instrument consists of responses to five specific questions. Although there will always be some degree of ambiguity as to the exact interpretation of the five moods and the implied frequencies, these responses imply a much greater degree of specificity. Consequently, one can be reasonably confident that a person who reports a higher MH5 score really does exhibit better mental health, and thereby well-being, than someone who reports a lower MH5 score, by its very definition.
23.
Specifically, the raw MH5 index score intervals 0–10, 10–20, etc., up to 90–100 have logit intervals of 2.23, 1.22, 1.00, 0.90, 0.85, 0.83, 0.90, 1.09, 1.54 and 3.21 (Perneger & Bovier, 2001). Accordingly, the following transformation function will linearise these intervals: \( \text{MH5}^{T} = \ln \left( \frac{0.00932\,\text{MH5} + 0.034}{1 - (0.00932\,\text{MH5} + 0.034)} \right) \). This produces a scale with lower and upper bounds of −3.35 and +3.35, with a mid-point of zero. For convenience, this is scaled to produce a 0–100 index in the analysis to follow.
24.
This approach is described in further detail in Kristoffersen (2017).
25.
It should be noted that this increase in the model’s explanatory power may be due to the fact that the MH5 and life satisfaction scores are largely subject to the same type of measurement error, in the sense that individuals attribute different meaning to a given measurement scale. This is accounted for in a fixed-effects panel model, in which case explanatory power increases by less, from 0.54 to 0.57, when mental health is added to the standard set of explanatory variables, including physical health.
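
As an illustration of the transformation given in note 23, the following minimal Python sketch evaluates the logit formula with the coefficients stated there (0.00932 and 0.034). The function names are hypothetical, and the final step is an assumption: the note says only that the result is "scaled to produce a 0–100 index", so a simple linear (min-max) rescaling is used here and may differ from the procedure actually applied in the chapter.

```python
import math


def mh5_logit(mh5: float) -> float:
    """Logit transform of a raw MH5 score (0-100), using the note 23 coefficients."""
    p = 0.00932 * mh5 + 0.034  # maps raw 0..100 onto roughly 0.034..0.966
    return math.log(p / (1.0 - p))


def mh5_rescaled(mh5: float) -> float:
    """Rescale the logit score to 0-100.

    ASSUMPTION: linear min-max rescaling between the logit values at
    raw scores 0 and 100; the source does not specify the method.
    """
    lo = mh5_logit(0.0)
    hi = mh5_logit(100.0)
    return 100.0 * (mh5_logit(mh5) - lo) / (hi - lo)


if __name__ == "__main__":
    for raw in (0, 10, 50, 90, 100):
        print(raw, round(mh5_logit(raw), 2), round(mh5_rescaled(raw), 1))
```

Under these assumptions the end points of the logit scale come out close to the −3.35 and +3.35 bounds stated in note 23, and a raw score of 50 maps to the mid-point. Equal raw intervals near the ends of the scale are stretched relative to those in the middle, which is the intended linearising effect.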

References

Akerlof, G. A. (1970). The market for “lemons”: Quality uncertainty and the market mechanism. Quarterly Journal of Economics, 84, 488–500.
Akerlof, G. A. (1980). A theory of social custom, of which unemployment may be one consequence. Quarterly Journal of Economics, 94, 749–775.
Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561–573.
Baker, B. O., Hardyck, C. D., & Petrinovich, L. F. (1966). Weak measurements vs. strong statistics: An empirical critique of S.S. Stevens’ proscriptions on statistics. Educational and Psychological Measurement, 26, 291–309.
Becker, G. S., & Lewis, H. G. (1973). On the interaction between the quantity and quality of children. Journal of Political Economy, 81(2, pt 1), s279–s288.
Blanchflower, D. G., & Oswald, A. J. (2004). Well-being over time in Britain and the USA. Journal of Public Economics, 88, 1359–1386.
Blanchflower, D. G., & Oswald, A. J. (2005). Happiness and the human development index: The paradox of Australia. NBER Working Paper Series (Working Paper 11416).
Blanton, H., & Jaccard, J. (2006). Arbitrary metrics in psychology. American Psychologist, 61(1), 27–41.
Borgatta, E. F., & Bohrnstedt, G. W. (1980). Level of measurement—Once over again. Sociological Methods and Research, 9, 147–160.
Brogden, H. E. (1977). The Rasch model, the law of comparative judgement and additive conjoint measurement. Psychometrika, 42, 631–634.
Brulé, G., & Veenhoven, R. (2017). The ‘10 excess’ phenomenon in responses to survey questions on happiness. Social Indicators Research, 131(2), 853–870. doi:10.1007/s11205-016-1265-x.
Bruni, L., & Sugden, R. (2007). The road not taken: How psychology was removed from economics, and how it might be brought back. The Economic Journal, 117(January), 146–173.
Butler, D., Isoni, A., Loomes, G., & Tsutsui, K. (2014). Beyond choice: Investigating the sensitivity and validity of measures of strength of preference. Experimental Economics, 17, 537–563.
Cantril, H. (1965). The pattern of human concerns. New Brunswick, N.J.: Rutgers University Press.
Clark, A. E., Frijters, P., & Shields, M. A. (2008). Relative income, happiness, and utility: An explanation for the Easterlin Paradox and other puzzles. Journal of Economic Literature, 46(1), 95–144.
Clark, A. E., & Oswald, A. J. (1996). Satisfaction and comparison income. Journal of Public Economics, 61, 359–381.
Colander, D. (2007). Edgeworth’s hedonimeter and the quest to measure utility. Journal of Economic Perspectives, 21(2), 215–225.
Conti, G., & Pudney, S. (2008). If you’re happy and you know it, clap your hands! Survey design and the analysis of satisfaction. UK: Institute for Social & Economic Research, University of Essex.
Cummins, R. A., & Gullone, E. (2000). Why should we not use 5-point Likert scales: The case for subjective quality of life measurement. International Conference on Quality of Life in Cities. Singapore: National University of Singapore.
Diener, E., Emmons, R. A., Larsen, R. J., & Griffin, S. (1985). The satisfaction with life scale. Journal of Personality Assessment, 49(1), 71–75.
Diener, E., & Suh, E. M. (Eds.). (2000). Culture and subjective well-being. Cambridge, Massachusetts; London, England: The MIT Press.
Easterlin, R. A. (1974). Does economic growth improve the human lot? Some empirical evidence. In P. A. David & M. W. Reder (Eds.), Nations and households in economic growth: Essays in honor of Moses Abramovitz (pp. 89–125). New York and London: Academic Press.
Edgeworth, F. Y. (1881 [1961]). Mathematical psychics: An essay on the application of mathematics to the moral sciences. New York: Augustus M. Kelley.
Ferrer-i-Carbonell, A., & Frijters, P. (2004). How important is methodology for the estimates of the determinants of happiness? The Economic Journal, 114(July), 641–659.
Gardner, J., & Oswald, A. J. (2001). Does money buy happiness? A longitudinal study using data on windfalls. Warwick, U.K: Warwick University.
Guttman, L. (1977). What is not what in statistics. The Statistician, 26, 81–107.
Headey, B. (2007). The set-point theory of well-being needs replacing: On the brink of a scientific revolution? DIW Berlin: Discussion papers 753.
Headey, B., & Wooden, M. (2004). The effects of wealth and income on subjective well-being and ill-being. Economic Record, 80(Special Issue), S24–S33.
Helliwell, J. F., & Putnam, R. D. (1995). Economic growth and social capital in Italy. Eastern Economic Journal, 21, 295–307.
Hirschauer, N., Lehberger, M., & Musshoff, O. (2014). Happiness and utility in economic thought—Or: What can we learn from happiness research for public policy analysis and public policy making? Social Indicators Research, 121, 647–674.
Kahneman, D., Krueger, A. B., Schkade, D. A., Schwarz, N., & Stone, A. A. (2004). A survey method for characterizing daily life experiences: The day reconstruction method. Science, 306, 1776–1780.
Kapteyn, A., Smith, J. P., & van Soest, A. (2007). Vignettes and self-reports of work disability in the United States and the Netherlands. American Economic Review, 97(1), 461–473.
Katzner, D. W. (1998). The misuse of measurement in economics. Metroeconomica, 49(1), 1–22.
Kristoffersen, I. (2010). The metrics of subjective wellbeing: Cardinality, neutrality and additivity. The Economic Record, 86(272), 98–123.
Kristoffersen, I. (2017). The metrics of subjective wellbeing data: An empirical evaluation of the ordinal and cardinal comparability of life satisfaction scores. Social Indicators Research, 130(2), 845–865.
Larsen, R. J., & Fredrickson, B. L. (1999). Measurement issues in emotional research. In D. Kahneman, E. Diener, & N. Schwarz (Eds.), Well-being: The foundations of hedonic psychology. New York: Russell Sage Foundation.
Lau, A. L. D. (2007). Measurement of subjective wellbeing: Cultural issues. 9th Quality of Life Conference. Melbourne: Deakin University.
Lord, F. (1953). On the statistical treatment of football numbers. American Psychologist, 8, 750–751.
Lucas, R. E. (2007). Adaptation and the set-point model of subjective well-being. Psychological Science, 16(2), 75–79.
Luce, R. D., & Tukey, J. W. (1964). Simultaneous conjoint measurement: A new type of fundamental measurement. Journal of Mathematical Psychology, 1, 1–27.
Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. The Psychological Review, 63(2), 81–97.
Ng, Y.-K. (1996). Happiness surveys: Some comparability issues and an exploratory survey based on just perceivable increments. Social Indicators Research, 38, 1–27.
Ng, Y.-K. (1997). A case for happiness, cardinalism, and interpersonal comparability. The Economic Journal, 107(445), 1848–1858.
Ng, Y.-K. (2003). From preference to happiness: Towards a more complete welfare economics. Social Choice and Welfare, 20, 307–350.
Ng, Y.-K. (2008). Happiness studies: Ways to improve comparability and some public policy implications. The Economic Record, 84(265), 253–266.
Oswald, A. (2008). On the curvature of the reporting function from objective reality to subjective feelings. Economics Letters, 100(3), 369–372.
Parducci, A. (1995). Happiness, pleasure, and judgment: The contextual theory and its applications. Hillsdale, N.J.: Erlbaum.
Perneger, T. V., & Bovier, P. A. (2001). Application of the Rasch model to the SF36 mental health 5 item scale (MH5). ISPOR Sixth Annual International Meeting, Value In Health.
Raczek, A. E., Ware, J. E., Bjorner, J. B., Gandek, B., Haley, S. M., Aaronson, N. K., et al. (1998). Comparisons of Rasch and summated rating scales constructed from SF-36 physical functioning items in seven countries: Results from the IQOLA project. International quality of life assessment. Journal of Clinical Epidemiology, 51(11), 1203–1214.
Rasch, G. (1961). On general laws and the meaning of measurement in psychology. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability. Berkeley, California: University of California Press.
Scherpenzeel, A. C., & Saris, W. E. (1993). The evaluation of measurement instruments by meta-analysis of multitrait-multimethod studies. Bulletin de Methodologie Sociologique, 39, 3–9.
Schwarz, N. (1995). What respondents learn from questionnaires: The survey interview and the logic of conversation. International Statistical Review, 63, 153–177.
Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677–680.
Stevens, S. S. (1951). Mathematics, measurement and psychophysics. In S. S. Stevens (Ed.), Handbook of experimental psychology (1–49). New York, Wiley.
Stevens, S. S. (1975). Psychophysics. New York: Wiley.
OECD. (2013). OECD guidelines on measuring subjective well-being. Paris: OECD Publishing. doi:10.1787/9789264191655-en.
Tukey, J. W. (1962). The future of data analysis. In L. V. Jones (Ed.), The collected works of John W. Tukey (Vol. III (1986), pp. 187–389). Belmont, CA: Wadsworth, Inc.
Van Praag, B. M. S. (1991). Ordinal and cardinal utility: An integration of the two dimensions of the welfare concept. Journal of Econometrics, 50, 69–89.
van Praag, B. M. S. (2007). Perspectives from the happiness literature and the role of new instruments for policy analysis. CESifo Economic Studies, 53(1), 42–68.
Van Praag, B. M. S., & Ferrer-i-Carbonell, A. (2004). Happiness quantified. New York: Oxford University Press.
Veenhoven, R. (1984). Conditions of happiness. Dordrecht: Kluwer Academic.
Velleman, P. F., & Wilkinson, L. (1993). Nominal, ordinal, interval, and ratio typologies are misleading. The American Statistician, 47(1), 65–72.
Ware, J. E., Snow, K. K., Kosinski, M., & Gandek, B. (2000). SF-36 health survey: Manual and interpretation guide. Lincoln, RI: QualityMetric Inc.
Wright, B. D. (1997). A history of social science measurement. Educational Measurement: Issues and Practice, 16, 33–45. doi:10.1111/j.1745-3992.1997.tb00606.x.

Author information

Authors and Affiliations

University of Western Australia, Crawley, Australia
Ingebjørg Kristoffersen

Corresponding author

Correspondence to Ingebjørg Kristoffersen.

Editor information

Editors and Affiliations

Happiness Studies, Erasmus University Rotterdam Happiness Studies, Rotterdam, The Netherlands
Gaël Brulé
Dipartimento di Scienze Statistiche, Università di Roma "La Sapienza", Roma, Italy
Filomena Maggino

About this chapter

Cite this chapter

Kristoffersen, I. (2017). Evaluating Comparability of Survey Data on Subjective Well-being. In: Brulé, G., Maggino, F. (eds) Metrics of Subjective Well-Being: Limits and Improvements. Happiness Studies Book Series. Springer, Cham. https://doi.org/10.1007/978-3-319-61810-4_8

DOI: https://doi.org/10.1007/978-3-319-61810-4_8
Published: 03 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61809-8
Online ISBN: 978-3-319-61810-4
eBook Packages: Social Sciences, Social Sciences (R0)