Preference discovery


Is the assumption that people automatically know their own preferences innocuous? We present an experiment studying the limits of preference discovery. If tastes must be learned through experience, preferences for some goods may never be learned because it is costly to try new things, and thus non-learned preferences may cause welfare loss. We conduct an online experiment in which finite-lived participants have an induced utility function over fictitious goods about whose marginal utilities they have initial guesses. Subjects learn most, but not all, of their preferences eventually. Choice reversals occur, but primarily in early rounds. Subjects slow their sampling of new goods over time, supporting our conjecture that incomplete learning can persist. Incomplete learning is more common for goods that are rare, have low initial value guesses, or appear in choice sets alongside goods that appear attractive. It is also more common for people with lower incomes or shorter lifetimes. More noise in initial value guesses has opposite effects for low-value and high-value goods because it affects the perceived likelihood that the good is worth trying. Over time, subjects develop a pessimistic bias in beliefs about goods’ values, since optimistic errors are more likely to be corrected. Overall, our results show that if people need to learn their preferences through consumption experience, that learning process will cause choice reversals, and even when a person has completed sampling the goods she is willing to try, she may continue to lose welfare because of suboptimal choices that arise from non-learned preferences.



  1.

    We distinguish between learning about objective circumstances and learning about one’s own tastes, which Braga and Starmer (2005) refer to as “institutional learning” and “value learning” respectively. Our focus is on value learning, so we assume the agent knows the objective features of all goods. Institutional learning is best separately modeled, e.g., in experimental consumption models (Kihlstrom et al 1984) or the two-armed bandit problem (Rothschild 1974).

  2.

    Brezzi and Lai (2000) show, in another theoretical study, that learning in a multi-armed bandit problem will be incomplete, but for a different reason (discounting) than the one we study.

  3.

    Chuang and Schechter (2015) find, in developing country contexts, very little stability in preferences within a person over years, except in survey measures of self-reported social preferences. However, their interpretation is that the experimental measures they study are not good measures of preferences in these contexts.

  4.

    In a related paper, Delaney et al (2019), we develop a formal theory exploring the extensive margin of preference discovery.

  5.

    Problems with the server caused fatal timeouts for some potential subjects. Of the 606 who did not complete the experiment, 547 (90.3%) had made no choices by the time they stopped. Most of these likely had server timeouts.

  6.

    The post-experiment questionnaire asked a comprehension question that posed a simplified version of the experiment’s choice problem. 82.4% of subjects answered correctly. Including only those who answered correctly produces qualitatively identical results, with two exceptions: the Mann–Whitney test for the effect of noise on efficiency becomes non-significant, and in the Tobit regression for \(T = 10\) the effect of noise on efficiency becomes significant at the 10% level. This paper reports results from the full sample of subjects.

  7.

    Subjects made other non-maximizing choices as well. Of 103,950 good-round pairs, subjects chose a value between 0 and 1 (less than a meaningful consumption experience) 322 times (or 0.3% of the time), and a value less than 0 a total of 14 times (less than 0.1% of the time). 99.3% of the time, subjects chose an integer between 0 and 6.

  8.

    Choices inconsistent with believed preferences have little impact on our experiment’s results. Excluding subjects who make these non-myopic-maximizing choices yields qualitatively similar results with only a few changes: In Table 3, the difference in efficiency across income levels becomes marginally significant (\(p = 0.090\)), while the difference in efficiency across noise levels ceases to be statistically significant (\(p = 0.118\)). In Table 4, the number of remaining rounds becomes significant at \(p < 0.001\) in the second model, while in the third model, noise becomes marginally significant (\(p = 0.064\)) as does income (\(p = 0.063\)).

  9.

    Recall that the numeraire value is known with certainty and thus is always “learned.” Results are similar excluding the numeraire. We also considered a version of specification III that used individual period dummies and found qualitatively similar results.

  10.

    The calculation of this threshold is shown in the electronic supplementary material. A key assumption is that in all future periods, only the numeraire good and this good will be available. As we show in the online appendix, an agent has achieved maximally experimental full relevant discovery according to this definition if she has tried all goods \(i\) with priors at least as large as \(X_i^t = 64.5 - \sigma + \sqrt{240.25 - 240 \cdot \frac{q_i \cdot y \cdot (T-t)}{1 + q_i \cdot y \cdot (T-t)}}\). Note that this threshold moves higher as \(t\) progresses.
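
The threshold in note 10 can be evaluated numerically to see how it rises over an agent's lifetime. The following is only an illustrative sketch, not the authors' code: the function name and the parameter values are hypothetical, and the symbols are read as they appear in the formula (\(\sigma\), \(q_i\), \(y\), \(T\), \(t\)), with their interpretation as noise, good-specific frequency, income, and lifetime being an assumption drawn from the abstract rather than from a definition given here.

```python
import math


def discovery_threshold(sigma, q_i, y, T, t):
    """Evaluate X_i^t from note 10 for one good in round t.

    Parameter meanings are assumed, not taken from the paper's definitions:
    sigma (noise), q_i (good-specific term), y (income), T (lifetime).
    """
    if t > T:
        raise ValueError("t must not exceed T")
    k = q_i * y * (T - t)  # the term that shrinks as fewer rounds remain
    return 64.5 - sigma + math.sqrt(240.25 - 240.0 * k / (1.0 + k))


# Hypothetical parameter values, chosen only for illustration.
thresholds = [
    discovery_threshold(sigma=10.0, q_i=0.5, y=20.0, T=10, t=t)
    for t in range(11)
]

for t, x in enumerate(thresholds):
    print(f"t={t:2d}  threshold={x:.3f}")
```

Under these assumed inputs the sequence is non-decreasing in \(t\), consistent with the note, and in the final round (\(t = T\)) the expression reduces to \(80 - \sigma\), since the fraction is zero and \(\sqrt{240.25} = 15.5\).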


We are grateful for helpful comments from the editor and two anonymous referees. For advice early on, we thank Yongsheng Xu, Annemie Maertens, and participants at FUR 2012, SABE/IAREP/ICABEEP 2013, and seminars at Williams College and George Mason University, and we particularly thank CeMENT 2014 participants Brit Grosskopf, Muriel Niederle, J. Aislinn Bohren, Angela de Oliveira, Jessica Hoel, and Jian Li for detailed feedback. We gratefully acknowledge funding from the Williams College Hellman Fellows Grant.

Delaney, J., Jacobson, S. & Moenig, T. Preference discovery. Exp Econ 23, 694–715 (2020).

  • Discovered preferences
  • Preference stability
  • Learning

JEL Classification

  • D81
  • D83
  • D01
  • D03