Machine learning programs often inherit social patterns reflected in their training data without any directed effort by programmers to include such biases. Computer scientists call this algorithmic bias. This paper explores the relationship between machine bias and human cognitive bias. In it, I argue that similarities between algorithmic and cognitive biases indicate a disconcerting sense in which sources of bias emerge out of seemingly innocuous patterns of information processing. The emergent nature of this bias obscures the existence of the bias itself, making it difficult to identify, mitigate, or evaluate using standard resources in epistemology and ethics. I demonstrate these points in the case of mitigation techniques by presenting what I call ‘the Proxy Problem’. One reason biases resist revision is that they rely on proxy attributes: seemingly innocuous attributes that correlate with socially-sensitive attributes, serving as proxies for the socially-sensitive attributes themselves. I argue that in both human and algorithmic domains, this problem presents a common dilemma for mitigation: attempts to discourage reliance on proxy attributes risk a tradeoff with judgement accuracy. This problem, I contend, admits of no purely algorithmic solution.
See, for example, O’Neil (2016, p. 154)’s discussion of discriminatory errors in Google’s automatic photo-tagging service.
Lowry and Macpherson (1988), p. 657.
See also Wu and Zhang (2016).
For a comprehensive overview of the current state of affairs regarding machine learning programs in social technology, see O’Neil (2016).
See, for example, Eberhardt et al. (2004).
Angwin et al. (2016). The exposé concerned the computer software COMPAS (Correctional Offender Management Profiling for Alternative Sanctions). COMPAS has received an enormous amount of attention in philosophy, machine learning, and everyday discussions of algorithmic bias, much of which is beyond the scope of this paper. Important for my purposes, however, is the fact that many of the issues it raises are not unique either to it or to algorithmic decision-making in general. For example, recent work by Kleinberg et al. (2016) and Chouldechova (2016) identifies three intuitive conditions any risk-assessment program must satisfy in order to be fair and unbiased. These criteria are, first, that the algorithm is well-calibrated, i.e., if it identifies a set of people as having a probability z of constituting positive instances, then approximately a z fraction of this set should indeed be positive instances; second, that it balance the positive class, i.e., the average score received by people constituting positive instances should be the same in each group; and third, that it balance the negative class, i.e., the average score received by people constituting the negative instances should be the same in each group (Kleinberg et al. 2016, p. 2). Strikingly, Kleinberg et al. (2016) demonstrate that in cases where base rates differ and our programs are not perfect predictors—which subsumes most cases—these three conditions cannot all be satisfied simultaneously. This means most (if not all) programs used in real-world scenarios will fail to satisfy all three fairness conditions. There is no such thing as an unbiased program in this sense. More strikingly, researchers take this so-called ‘impossibility result’ to generalize to all predictors, whether they be algorithmic or human decision-makers (Kleinberg et al. 2016, p. 6; Miconi 2017, p. 4). For a compelling argument about how best to prioritize different notions of fairness, see Hellman (2019).
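The tension between calibration and the balance conditions can be made concrete with a minimal sketch (the groups, sizes, and base rates here are invented for illustration; this is not COMPAS data). A scorer that assigns every member of a group that group's base rate is perfectly well-calibrated, yet when base rates differ it necessarily violates balance for the positive class:

```python
def average(xs):
    return sum(xs) / len(xs)

def make_group(n, base_rate):
    """n people; a base_rate fraction are positive instances.
    Every member receives the group's base rate as their risk score."""
    positives = int(n * base_rate)
    labels = [1] * positives + [0] * (n - positives)
    scores = [base_rate] * n
    return labels, scores

labels_a, scores_a = make_group(100, 0.6)  # group A: base rate 0.6
labels_b, scores_b = make_group(100, 0.3)  # group B: base rate 0.3

# Well-calibrated: among people scored z, a z fraction are positive.
# (All scores within a group are equal, so there is one bucket per group.)
assert average(labels_a) == scores_a[0]
assert average(labels_b) == scores_b[0]

# Balance for the positive class fails: the average score received by
# positive instances differs between groups whenever base rates do.
avg_pos_a = average([s for s, y in zip(scores_a, labels_a) if y == 1])
avg_pos_b = average([s for s, y in zip(scores_b, labels_b) if y == 1])
print(avg_pos_a, avg_pos_b)  # roughly 0.6 vs. 0.3
```

The same arithmetic generalizes: any calibrated scorer over groups with unequal base rates must give positive instances different average scores across groups, which is the shape of the Kleinberg et al. (2016) result.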
Ideally, I would present a walkthrough of the operation of one of the algorithms discussed above. Unfortunately, providing a detailed analysis of one of these algorithms is difficult, if not impossible, since information relating to the operation of commercial machine learning programs is often intentionally inaccessible for the sake of competitive commercial advantage or client security purposes. Even if these algorithms were made available for public scrutiny, many likely instantiate so-called ‘black-box’ algorithms, i.e., those where it is difficult if not impossible for human programmers to understand or explain why any particular outcome occurs. This lack of transparency with respect to their operation raises a host of questions about fairness, objectivity, and accuracy. However, these issues are beyond the scope of this discussion.
The problem of which features are most relevant for some classification task will itself raise important philosophical issues, some of which I return to in my discussion of the Proxy Problem in Sect. 4.
This case is discussed in more detail by Daumé III (2015, pp. 30–32).
See Johnson (2020).
This example and parts of its description are borrowed from Johnson (2020).
See Holroyd and Sweetman (2016) for discussion and examples.
Corneille and Hutter (2020) survey conceptual ambiguities in the use of ‘implicit’ in attitude research. According to them, among the possible interpretations present in standard literature are implicit as automatic, implicit as indirect, and implicit as associative (as well as hybrids). The interpretation of implicit as truly implicit, i.e., non-representational, is largely ignored or altogether absent from these interpretations. See also Holroyd et al. (2017), pp. 3–7.
More commonly in cases like this, she might instead offer an explanation that, upon investigation, is revealed to be confabulation.
See Johnson (2020).
I discuss these points in greater detail in Johnson (2020), following considerations presented by Antony (2001, 2016). For other arguments in favor of and against including normative and accuracy conditions in the definition of bias and the related notion stereotype, see Munton (2019b), Beeghly (2015), and Blum (2004).
I return to these points at the end of Sect. 4.
In the case of implicit human cognitive bias, this point about flexibility is often raised under the heading of the heterogeneity of bias. See, in particular, Holroyd and Sweetman (2016).
See, for example, Adler et al. (2016)’s discussion of how to audit so-called ‘black box’ algorithms that appear to rely on proxy attributes in lieu of target attributes.
To take an example from earlier, the recidivism algorithm COMPAS produces patterns discriminatory toward African Americans despite race demographic information not being explicitly included in the information given to it about each defendant. Instead, it has access to other features that collectively correlate with race.
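This kind of redundant encoding can be illustrated with a toy sketch (the feature name, correlation strength, and group labels are all invented for illustration and bear no relation to COMPAS's actual inputs): a rule that is never shown the sensitive attribute can still largely reconstruct it from a single correlated feature.

```python
import random

random.seed(0)  # deterministic toy data

def make_person(group):
    # Assumed for illustration: 'neighborhood' matches the sensitive
    # group label 90% of the time, making it a proxy attribute.
    neighborhood = group if random.random() < 0.9 else 1 - group
    return {"neighborhood": neighborhood, "group": group}

people = [make_person(g) for g in [0, 1] * 500]

# A "group-blind" rule: it never reads person["group"], only the proxy.
def predict_group(person):
    return person["neighborhood"]

accuracy = sum(predict_group(p) == p["group"] for p in people) / len(people)
print(accuracy)  # close to 0.9: the withheld attribute is largely recoverable
```

Withholding the sensitive attribute therefore does little on its own; the information re-enters through whichever features correlate with it.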
For reasons soon to be discussed, programmers are forced to rely on proxies. Moreover, the notion of proxy discrimination is familiar in discussions of discrimination in ethics and law. See, for example, Alexander (1992), pp. 167–173.
The median age of Fox News viewers is 65 (Epstein 2016). Although age might not correlate exactly with perceived trustworthiness, we can assume for simplicity that it does.
This feature space is meant to be a simplification and is not intended to exactly match real-world data.
Cognitive examples where some attributes can stand in for others (and where this is taken to reflect some quirk underwriting heuristic judgements) are well-studied in empirical work on ‘attribute substitution effects’. According to the theory, there are some instances where decisions made on the basis of a target attribute—familiarity, say—are too cognitively demanding. This prompts the cognitive system to shift to a less cognitively demanding task that relies on some irrelevant, but more easily identifiable attribute—e.g., beauty. For discussion, see Kahneman and Frederick (2002) and Monin (2003).
This is a perfectly adequate first-pass analysis of the notion of proxy as it is used in machine learning. See, for example, Eubanks (2018, pp. 143–145, 168)’s discussion of using community calls to a child abuse hotline as a proxy for actual child abuse. However, I believe the key features of proxies that make them a useful concept in our explanations about discrimination go deeper than this first-pass analysis. Ultimately, I take the philosophically rich features of proxy discrimination to stem from an externalism about intentionality and anti-individualism about the natures of representational states, points I discuss further in my “Proxies Aren’t Intentional, They’re Intentional” (MS).
See Byrd (2019) for a review of the effectiveness of cognitive debiasing strategies that utilize counter-stereotypical training (i.e., counterconditioning) and what this effectiveness entails for theories of social cognitive bias-constructs.
It additionally bolsters the point that combating problematic biases will require a two-pronged solution that focuses both on the individual and structural level. See Antony (2016) and Huebner (2016), as well as the debate between so-called ‘structural prioritizers’ and ‘anti-anti-individualists’, including Haslanger (2015, 2016b, 2016a), Ayala Lopez (2016, 2018), Ayala Lopez and Vasilyeva (2015), Ayala Lopez and Beeghly (2020), Madva (2016), and Soon (2019).
Wilhelm et al. (2018) put the percentage between 14 and 16%, despite women constituting closer to 25% of all philosophy faculty and 50% of the general population. See also analyses of the number of publications in philosophy by female authors (and the underrepresentation of women in philosophy in general) presented in Paxton et al. (2012), Saul (2013), Jennings and Schwitzgebel (2017), Nagel et al. (2018), and Kings (2019).
I am forced to simplify a complex discussion about what makes for “accurate judgements of being a worthwhile candidate”. For reasons this discussion brings out, traditional metrics of academic success, e.g., standards for tenure promotion, will have baked into them (potentially problematic) preconceptions that will often track structural inequalities. Thus, to be “accurate” with respect to those standards will entail alignment with problematic social trends. We could adopt different measures of success, e.g., contributions to diversity, that don’t have baked into them those same preconceptions. However, these other metrics would arguably have baked into them some other preconceptions that are themselves morally- and politically-laden. I discuss in more detail how seemingly neutral notions like ‘accuracy’ are potentially value-laden in my “Are Algorithms Value-Free?” (MS). Thanks to an anonymous referee for pushing me to be more explicit about these points and for noting how the idea of a “forced choice” between inclusivity and excellence likely requires an overly narrow conception of excellence. See also Stewart and Valian (2018), pp. 212–213.
Indeed, as an anonymous referee points out, this is arguably evidence that a “colorblind” approach that attempts to ignore socially-sensitive features is misguided to begin with. An alternative to this approach would be to explicitly code for the socially-sensitive features, i.e., allow explicit reference to features like race in the program, so as to overtly counter-balance the effects caused by the Proxy Problem. This follows Anderson (2010, pp. 155–156)’s discussion of how color-conscious policies are required in order to end race-based injustice. I agree, but leave more detailed consideration of these alternative strategies for future work.
I have many to thank for the development of this paper over the years. Firstly, thanks to Tatyana Kostochka for drawing the illustrations used to make the figures. For comments and discussion about the paper, thanks to Josh Armstrong, Rima Basu, Erin Beeghly, Renee Bolinger, Michael Brownstein, Elisabeth Camp, David Chalmers, Sam Cumming, Gabe Dupre, Tina Eliassi-Rad, Maegan Fairchild, Branden Fitelson, Daniel Fogal, Deborah Hellman, Pamela Hieronymi, Justin Humphreys, Amber Kavka-Warren, Seth Lazar, Travis LaCroix, Dustin Locke, Alex Madva, Eric Mandelbaum, Annette Martin, Jessie Munton, Eleonore Neufeld, Sara Protasi, Chelsea Rosenthal, Ronni Gura Sadovsky, Ayana Samuel, Susanna Schellenberg, Susanna Siegel, Seana Shiffrin, Joy Shim, and Annette Zimmermann. Previous versions of this paper were presented at and received valuable feedback from the Vancouver Summer Philosophy Conference at the University of British Columbia, Athena in Action at Princeton University, Philosophy of Machine Learning: Knowledge and Causality at University of California, Irvine, and Bias in Context Four at the University of Utah. Finally, I want to acknowledge the helpful suggestions received from Nick Byrd and an anonymous referee at Synthese.
Adler, P., Falk, C., Friedler, S. A., Rybeck, G., Scheidegger, C., Smith, B., & Venkatasubramanian, S. (2016). Auditing black-box models for indirect influence. In 2016 IEEE 16th international conference on data mining (ICDM) (pp. 1–10). IEEE.
Alexander, L. (1992). What makes wrongful discrimination wrong? Biases, preferences, stereotypes, and proxies. University of Pennsylvania Law Review, 141(1), 149.
Anderson, E. (2010). The imperative of integration. Princeton: Princeton University Press.
Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016). Machine bias: There’s software used across the country to predict future criminals. And it’s biased against blacks. New York: ProPublica.
Antony, L. (2001). Quine as feminist: The radical import of naturalized epistemology. In L. Antony & C. E. Witt (Eds.), A mind of one’s own: Feminist essays on reason and objectivity (pp. 110–153). Boulder: Westview Press.
Antony, L. (2016). Bias: Friend or foe? In M. Brownstein & J. Saul (Eds.), Implicit bias and philosophy, volume 1: Metaphysics and epistemology (pp. 157–190). Oxford: Oxford University Press.
Ayala Lopez, S. (2016). Comments on Alex Madva’s ‘A plea for anti-anti-individualism: How oversimple psychology misleads social policy’. In Ergo symposium.
Ayala Lopez, S. (2018). A structural explanation of injustice in conversations: It’s about norms. Pacific Philosophical Quarterly, 99(4), 726–748.
Ayala Lopez, S., & Beeghly, E. (2020). Explaining injustice: Structural analysis, bias, and individuals. In E. Beeghly & A. Madva (Eds.), Introduction to implicit bias: Knowledge, justice, and the social mind. Abingdon: Routledge.
Ayala Lopez, S., & Vasilyeva, N. (2015). Explaining injustice in speech: Individualistic vs structural explanation. In R. Dale, C. Jennings, P. P. Maglio, T. Matlock, D. C. Noelle, A. Warlaumont, & J. Yoshimi (Eds.), Proceedings of the 37th annual conference of the Cognitive Science Society (pp. 130–135). Austin: Cognitive Science Society.
Basu, R. (2019a). The wrongs of racist beliefs. Philosophical Studies, 176(9), 2497–2515.
Basu, R. (2019b). What we epistemically owe to each other. Philosophical Studies, 176(4), 915–931.
Beeghly, E. (2015). What is a stereotype? What is stereotyping? Hypatia, 30(4), 675–691.
Blum, L. (2004). Stereotypes and stereotyping: A moral analysis. Philosophical Papers, 33(3), 251–289.
Bolinger, R. J. (2018). The rational impermissibility of accepting (some) racial generalizations. Synthese, 1–17.
Byrd, N. (2019). What we can (and can’t) infer about implicit bias from debiasing experiments. Synthese, 1–29.
Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183–186.
Chouldechova, A. (2016). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. arXiv preprint arXiv:1610.07524.
Corneille, O., & Hutter, M. (2020). Implicit? What do you mean? A comprehensive review of the delusive implicitness construct in attitude research. Personality and Social Psychology Review.
Datta, A., Tschantz, M. C., & Datta, A. (2015). Automated experiments on ad privacy settings. In Proceedings on privacy enhancing technologies (Vol. 2015(1)).
Daumé III, H. (2015). A course in machine learning. https://ciml.info/.
Eberhardt, J. L., Goff, P. A., Purdie, V. J., & Davies, P. G. (2004). Seeing black: Race, crime, and visual processing. Journal of Personality and Social Psychology, 87(6), 876–893.
Epstein, A. (2016). Fox News’s biggest problem isn’t the Ailes ouster, it’s that its average viewer is a dinosaur. New York: Quartz Media.
Eubanks, V. (2018). Automating inequality: How high-tech tools profile, police and punish the poor. New York: St. Martin’s Press.
Gawronski, B., Hofmann, W., & Wilbur, C. J. (2006). Are implicit attitudes unconscious? Consciousness and Cognition, 15(3), 485–499.
Gendler, T. S. (2011). On the epistemic costs of implicit bias. Philosophical Studies, 156(1), 33–63.
Hahn, A., Judd, C. M., Hirsh, H. K., & Blair, I. V. (2014). Awareness of implicit attitudes. Journal of Experimental Psychology: General, 143(3), 1369–1392.
Haslanger, S. (2015). Social structure, narrative, and explanation. Canadian Journal of Philosophy, 45(1), 1–15.
Haslanger, S. (2016a). Comments on Alex Madva’s ‘A plea for anti-anti-individualism: How oversimple psychology misleads social policy’. In Ergo symposium.
Haslanger, S. (2016b). What is a (social) structural explanation? Philosophical Studies, 173(1), 113–130.
Hellman, D. (2019). Measuring algorithmic fairness. Virginia Public Law and Legal Theory Research Paper No. 2019-39.
Holroyd, J., Scaife, R., & Stafford, T. (2017). What is implicit bias? Philosophy Compass, 12(10), e12437.
Holroyd, J., & Sweetman, J. (2016). The heterogeneity of implicit bias. In M. Brownstein & J. Saul (Eds.), Implicit bias and philosophy volume 1: metaphysics and epistemology (pp. 80–103). Oxford: Oxford University Press.
Huebner, B. (2016). Implicit bias, reinforcement learning, and scaffolded moral cognition. In M. Brownstein & J. Saul (Eds.), Implicit bias and philosophy volume 1: Metaphysics and epistemology (pp. 47–79). Oxford: Oxford University Press.
Jennings, C., & Schwitzgebel, E. (2017). Women in philosophy: Quantitative analyses of specialization, prevalence, visibility, and generational change. Public Affairs Quarterly, 31, 83–105.
Johnson, G. M. (2020). The structure of bias. Mind. https://doi.org/10.1093/mind/fzaa011.
Kahneman, D., & Frederick, S. (2002). Representativeness revisited: Attribute substitution in intuitive judgment. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and Biases: The psychology of intuitive judgement (1st ed., pp. 49–81). Cambridge: Cambridge University Press.
Kings, A. E. (2019). Philosophy’s diversity problem: Understanding the underrepresentation of women and minorities in philosophy. Metaphilosophy, 50(3), 212–230.
Klare, B. F., Burge, M. J., Klontz, J. C., Vorder Bruegge, R. W., & Jain, A. K. (2012). Face recognition performance: Role of demographic information. IEEE Transactions on Information Forensics and Security, 7(6), 1789–1801.
Kleinberg, J., Mullainathan, S., & Raghavan, M. (2016). Inherent trade-offs in the fair determination of risk scores. arXiv preprint arXiv:1609.05807.
Lowry, S., & Macpherson, G. (1988). A blot on the profession. British Medical Journal (Clinical Research ed.), 296(6623), 657.
Madva, A. (2016). A plea for anti-anti-individualism: How oversimple psychology misleads social policy. Ergo, an Open Access Journal of Philosophy, 3, 701.
Miconi, T. (2017). The impossibility of “fairness”: A generalized impossibility result for decisions. arXiv preprint arXiv:1707.01195.
Monin, B. (2003). The warm glow heuristic: When liking leads to familiarity. Journal of Personality and Social Psychology, 85(6), 1035–1048.
Munton, J. (2019a). Beyond accuracy: Epistemic flaws with statistical generalizations. Philosophical Issues, 29(1), 228–240.
Munton, J. (2019b). Bias in a biased system: Visual perceptual prejudice. In Bias, reason and enquiry: New perspectives from the crossroads of epistemology and psychology. Oxford: Oxford University Press.
Nagel, M., Peppers-Bates, S., Leuschner, A., & Lindemann, A. (2018). Feminism and philosophy. The American Philosophical Association, 17(2), 33.
Narayanan, A. (2016). Language necessarily contains human biases, and so will machines trained on language corpora. Freedom to Tinker. https://freedom-to-tinker.com/2016/08/24/language-necessarily-contains-human-biases-and-so-will-machines-trained-on-language-corpora/.
O’Neil, C. (2016). Weapons of math destruction: How big data increases inequality and threatens democracy. New York: Crown Publishing Group.
Paxton, M., Figdor, C., & Tiberius, V. (2012). Quantifying the gender gap: An empirical study of the underrepresentation of women in philosophy. Hypatia, 27(4), 949–957.
Price, R. (2016). Microsoft is deleting its AI chatbot’s incredibly racist tweets. New York: Business Insider.
Saul, J. (2013). Implicit bias, stereotype threat, and women in philosophy. In K. Hutchison & F. Jenkins (Eds.), Women in philosophy: What needs to change? (pp. 39–60). Oxford: Oxford University Press.
Soon, V. (2019). Implicit bias and social schema: A transactive memory approach. Philosophical Studies, 1–21.
Stephens-Davidowitz, S. (2014). Google, tell me. Is my son a genius? New York: The New York Times.
Stewart, A. J., & Valian, V. (2018). An inclusive academy: Achieving diversity and excellence. Cambridge: MIT Press.
Wilhelm, I., Conklin, S. L., & Hassoun, N. (2018). New data on the representation of women in philosophy journals: 2004–2015. Philosophical Studies, 175(6), 1441–1464.
Wu, X., & Zhang, Z. (2016). Automated inference on criminality using face images. arXiv preprint arXiv:1611.04135.
Johnson, G.M. Algorithmic bias: on the implicit biases of social technology. Synthese (2020). https://doi.org/10.1007/s11229-020-02696-y
- Algorithmic bias
- Social bias
- Machine bias
- Implicit bias