Algorithmic bias: on the implicit biases of social technology

Abstract

Machine learning programs often inherit social patterns reflected in their training data without any directed effort by programmers to include such biases. Computer scientists call this algorithmic bias. This paper explores the relationship between machine bias and human cognitive bias. In it, I argue that similarities between algorithmic and cognitive biases indicate a disconcerting sense in which sources of bias emerge out of seemingly innocuous patterns of information processing. The emergent nature of this bias obscures the existence of the bias itself, making it difficult to identify, mitigate, or evaluate using standard resources in epistemology and ethics. I demonstrate these points in the case of mitigation techniques by presenting what I call ‘the Proxy Problem’. One reason biases resist revision is that they rely on proxy attributes: seemingly innocuous attributes that correlate with socially-sensitive attributes and so come to stand in for them. I argue that in both the human and algorithmic domains, this problem presents a common dilemma for mitigation: attempts to discourage reliance on proxy attributes risk a tradeoff with judgement accuracy. This problem, I contend, admits of no purely algorithmic solution.

Notes

  1. Price (2016).

  2. See, for example, O’Neil (2016, p. 154)’s discussion of discriminatory errors in Google’s automatic photo-tagging service.

  3. Stephens-Davidowitz (2014).

  4. Narayanan (2016).

  5. Lowry and Macpherson (1988), p. 657.

  6. See also Wu and Zhang (2016).

  7. For a comprehensive overview of the current state of affairs regarding machine learning programs in social technology, see O’Neil (2016).

  8. See, for example, Eberhardt et al. (2004).

  9. Angwin et al. (2016). The exposé concerned the computer software COMPAS (Correctional Offender Management Profiling for Alternative Sanctions). COMPAS has received an enormous amount of attention in philosophy, machine learning, and everyday discussions of algorithmic bias, much of which is beyond the scope of this paper. Important for my purposes, however, is the fact that many of the issues it raises are not unique either to it or to algorithmic decision-making in general. For example, recent work by Kleinberg et al. (2016) and Chouldechova (2016) identifies three intuitive conditions any risk-assessment program must satisfy in order to count as fair and unbiased. These criteria are, first, that the algorithm is well-calibrated, i.e., if it identifies a set of people as having a probability z of constituting positive instances, then approximately a z fraction of this set should indeed be positive instances; second, that it balance the positive class, i.e., the average score received by people constituting positive instances should be the same in each group; and third, that it balance the negative class, i.e., the average score received by people constituting negative instances should be the same in each group (Kleinberg et al. 2016, p. 2). Strikingly, Kleinberg et al. (2016) demonstrate that in cases where base rates differ and our programs are not perfect predictors—which subsumes most cases—these three conditions necessarily trade off against one another. This means most (if not all) programs used in real-world scenarios will fail to satisfy all three fairness conditions. There is no such thing as an unbiased program in this sense. More strikingly, researchers take this so-called ‘impossibility result’ to generalize to all predictors, whether they be algorithmic or human decision-makers (Kleinberg et al. 2016, p. 6; Miconi 2017, p. 4). For a compelling argument about how best to prioritize different notions of fairness, see Hellman (2019).
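
For illustration only, the following sketch computes these three quantities on invented data; the score model, group structure, and base rates are assumptions of the example, not values drawn from Kleinberg et al. (2016) or from COMPAS. With differing base rates and an imperfect score, the positive- and negative-class averages come apart across groups, which is the shape of the trade-off the impossibility result describes.

```python
# Illustrative sketch of the three fairness conditions: calibration within
# groups, balance for the positive class, and balance for the negative class.
# All data are synthetic and chosen only to make the definitions concrete.
import numpy as np

def calibration_report(scores, labels, groups, bins=10):
    """Per group and per score bin: (mean predicted score, observed positive rate)."""
    report = {}
    for g in np.unique(groups):
        s, y = scores[groups == g], labels[groups == g]
        bin_ids = np.minimum((s * bins).astype(int), bins - 1)
        report[g] = {b: (s[bin_ids == b].mean(), y[bin_ids == b].mean())
                     for b in range(bins) if np.any(bin_ids == b)}
    return report

def class_balance(scores, labels, groups, positive=True):
    """Average score assigned to the positive (or negative) class within each group."""
    cls = 1 if positive else 0
    return {g: scores[(groups == g) & (labels == cls)].mean() for g in np.unique(groups)}

rng = np.random.default_rng(0)
n = 10_000
groups = rng.integers(0, 2, n)               # two demographic groups
base_rate = np.where(groups == 0, 0.3, 0.5)  # differing base rates
labels = rng.binomial(1, base_rate)          # actual outcomes
scores = np.clip(0.6 * labels + 0.4 * base_rate + rng.normal(0, 0.1, n), 0, 1)  # imperfect predictor

print(calibration_report(scores, labels, groups)[0])           # calibration report, group 0
print(class_balance(scores, labels, groups, positive=True))    # positive-class balance by group
print(class_balance(scores, labels, groups, positive=False))   # negative-class balance by group
```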

  10. Ideally, I would present a walkthrough of the operation of one of the algorithms discussed above. Unfortunately, providing a detailed analysis of any of these algorithms is difficult, if not impossible, since information about the operation of commercial machine learning programs is often intentionally kept inaccessible for reasons of competitive advantage or client security. Even if these algorithms were made available for public scrutiny, many likely instantiate so-called ‘black-box’ algorithms, i.e., algorithms whose particular outcomes are difficult, if not impossible, for human programmers to understand or explain. This lack of transparency creates myriad concerning questions about fairness, objectivity, and accuracy. However, these issues are beyond the scope of this discussion.

  11. The problem of which features are most relevant for some classification task will itself raise important philosophical issues, some of which I return to in my discussion of the Proxy Problem in Sect. 4.

  12. This case is discussed in more detail by Daumé III (2015, pp. 30–32).

  13. See Johnson (2020).

  14. This example and parts of its description are borrowed from Johnson (2020).

  15. See Holroyd and Sweetman (2016) for discussion and examples.

  16. Recently, theorists including Gawronski et al. (2006) and Hahn et al. (2014) have disputed that implicit biases are in fact unconscious. I don’t take up the dispute in detail here because it is largely irrelevant to my claims regarding the representational nature of bias-constructs.

  17. Corneille and Hutter (2020) survey conceptual ambiguities in the use of ‘implicit’ in attitude research. According to them, among the possible interpretations present in standard literature are implicit as automatic, implicit as indirect, and implicit as associative (as well as hybrids). The interpretation of implicit as truly implicit, i.e., non-representational, is largely ignored or altogether absent from these interpretations. See also Holroyd et al. (2017), pp. 3–7.

  18. More typically in cases like this, she might instead offer an explanation that, upon investigation, is revealed to be confabulation.

  19. See Johnson (2020).

  20. I discuss these points in greater detail in Johnson (2020), following considerations presented by Antony (2001, 2016). For other arguments in favor of and against including normative and accuracy conditions in the definition of bias and the related notion stereotype, see Munton (2019b), Beeghly (2015), and Blum (2004).

  21. I return to these points at the end of Sect. 4.

  22. In the case of implicit human cognitive bias, this point about flexibility is often raised under the heading of the heterogeneity of bias. See, in particular, Holroyd and Sweetman (2016).

  23. See, for example, Adler et al. (2016)’s discussion of how to audit so-called ‘black box’ algorithms that appear to rely on proxy attributes in lieu of target attributes.
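
As a rough, hedged illustration of what such an audit might look like (much simpler than Adler et al.'s actual indirect-influence procedure, which also scrubs the audited feature's traces from correlated columns), one can estimate a fitted model's reliance on a feature by permuting that column and measuring the drop in accuracy. The data, model, and feature indices below are synthetic placeholders.

```python
# Naive permutation audit: how much does accuracy drop when one column is
# scrambled? This only catches direct reliance on the column; indirect-influence
# audits in the style of Adler et al. (2016) go further. Data are synthetic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=6, n_informative=4, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

def permutation_influence(model, X, y, column, n_repeats=20):
    """Average accuracy drop when `column` is randomly permuted."""
    base = model.score(X, y)
    drops = []
    for _ in range(n_repeats):
        X_perm = X.copy()
        X_perm[:, column] = rng.permutation(X_perm[:, column])
        drops.append(base - model.score(X_perm, y))
    return float(np.mean(drops))

for j in range(X.shape[1]):
    print(f"feature {j}: estimated influence {permutation_influence(model, X, y, j):.3f}")
```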

  24. To take an example from earlier, the recidivism algorithm COMPAS produces patterns discriminatory toward African Americans even though race demographic information is not explicitly included in the information given to it about each defendant. Instead, it has access to other features that collectively correlate with race.
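
A minimal sketch of this point, on entirely synthetic data: if a withheld attribute can be recovered from the remaining features well above chance, those features jointly function as a proxy for it. The feature distributions and correlations below are invented for illustration, not modeled on COMPAS.

```python
# Even with the sensitive column removed, the remaining features may jointly
# encode it. Recovering it above chance is a simple diagnostic that proxies
# are present. All data are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 5000
protected = rng.integers(0, 2, n)  # withheld sensitive attribute
features = np.column_stack([
    rng.normal(loc=0.8 * protected, scale=1.0, size=n),  # correlates with the attribute
    rng.normal(loc=0.5 * protected, scale=1.0, size=n),  # correlates with the attribute
    rng.normal(loc=0.0, scale=1.0, size=n),              # genuinely unrelated
])

acc = cross_val_score(LogisticRegression(), features, protected, cv=5).mean()
print(f"withheld attribute recoverable with accuracy ~ {acc:.2f} (chance = 0.50)")
```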

  25. For reasons soon to be discussed, programmers are forced to rely on proxies. Moreover, the notion of proxy discrimination is familiar in discussions of discrimination in ethics and law. See, for example, Alexander (1992), pp. 167–173.

  26. The median age of Fox News viewers is 65 (Epstein 2016). Although age might not correlate exactly with perceived trustworthiness, we can assume for simplicity that it does.

  27. This feature space is meant to be a simplification and is not intended to exactly match real-world data.
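
For concreteness, here is one purely synthetic version of such a feature space, with all numbers invented: the outcome depends on a withheld socially-sensitive attribute, age correlates with that attribute (as in the Fox News example above), and so a classifier that can see age outperforms one that cannot. This is the accuracy trade-off the Proxy Problem turns on.

```python
# Synthetic feature space: the label tracks a withheld sensitive attribute,
# and age correlates with that attribute, so age acts as a proxy. Removing
# the proxy costs accuracy. Numbers are invented for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n = 4000
sensitive = rng.integers(0, 2, n)                            # withheld attribute
age = rng.normal(45 + 20 * sensitive, 10, n)                 # proxy: shifted by group
unrelated = rng.normal(0, 1, n)                              # genuinely irrelevant feature
label = rng.binomial(1, np.where(sensitive == 1, 0.8, 0.2))  # outcome tracks the attribute

for name, X in [("with age (proxy)", np.column_stack([age, unrelated])),
                ("without age", unrelated.reshape(-1, 1))]:
    acc = cross_val_score(LogisticRegression(), X, label, cv=5).mean()
    print(f"{name}: accuracy ~ {acc:.2f}")
```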

  28. Cognitive examples where some attributes can stand in for others (and where this is taken to reflect some quirk underwriting heuristic judgements) are well-studied in empirical work on ‘attribute substitution effects’. According to the theory, there are some instances where decisions made on the basis of a target attribute—familiarity, say—are too cognitively demanding. This prompts the cognitive system to shift to a less cognitively demanding task that relies on some irrelevant, but more easily identifiable attribute—e.g., beauty. For discussion, see Kahneman and Frederick (2002) and Monin (2003).

  29. This is a perfectly adequate first-pass analysis of the notion of proxy as it is used in machine learning. See, for example, Eubanks (2018, pp. 143–145, 168)’s discussion of using community calls to a child abuse hotline as a proxy for actual child abuse. However, I believe the key features of proxies that make them a useful concept in our explanations about discrimination go deeper than this first-pass analysis. Ultimately, I take the philosophically rich features of proxy discrimination to stem from an externalism about intentionality and anti-individualism about the natures of representational states, points I discuss further in my “Proxies Aren’t Intentional, They’re Intentional” (MS).

  30. See Byrd (2019) for a review of the effectiveness of cognitive debiasing strategies that utilize counter-stereotypical training (i.e., counterconditioning) and what this effectiveness entails for theories of social cognitive bias-constructs.

  31. It additionally bolsters the point that combating problematic biases will require a two-pronged solution that focuses both on the individual and structural level. See Antony (2016) and Huebner (2016), as well as the debate between so-called ‘structural prioritizers’ and ‘anti-anti-individualists’, including Haslanger (2015, 2016b, 2016a), Ayala Lopez (2016, 2018), Ayala Lopez and Vasilyeva (2015), Ayala Lopez and Beeghly (2020), Madva (2016), and Soon (2019).

  32. Wilhelm et al. (2018) put the percentage of articles authored by women in prominent philosophy journals between 14 and 16%, despite women constituting closer to 25% of all philosophy faculty and 50% of the general population. See also analyses of the number of publications in philosophy by female authors (and the underrepresentation of women in philosophy in general) presented in Paxton et al. (2012), Saul (2013), Jennings and Schwitzgebel (2017), Nagel et al. (2018), and Kings (2019).

  33. I am forced to simplify a complex discussion about what makes for “accurate judgements of being a worthwhile candidate”. For reasons this discussion brings out, traditional metrics of academic success, e.g., standards for tenure promotion, will have baked into them (potentially problematic) preconceptions that will often track structural inequalities. Thus, to be “accurate” with respect to those standards will entail alignment with problematic social trends. We could adopt different measures of success, e.g., contributions to diversity, that don’t have baked into them those same preconceptions. However, these other metrics would arguably have baked into them some other preconceptions that are themselves morally- and politically-laden. I discuss in more detail how seemingly neutral notions like ‘accuracy’ are potentially value-laden in my “Are Algorithms Value-Free?” (MS). Thanks to an anonymous referee for pushing me to be more explicit about these points and for noting how the idea of a “forced choice” between inclusivity and excellence likely requires an overly narrow conception of excellence. See also Stewart and Valian (2018), pp. 212–213.

  34. Indeed, as an anonymous referee points out, this is arguably evidence that a “colorblind” approach that attempts to ignore socially-sensitive features is misguided to begin with. An alternative to this approach would be to explicitly code for the socially-sensitive features, i.e., to allow explicit reference to features like race in the program, so as to overtly counter-balance the effects caused by the Proxy Problem. This follows Anderson (2010, pp. 155–156)’s discussion of how color-conscious policies are required in order to end race-based injustice. I agree, but leave more detailed consideration of these alternative strategies for future work.
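
One generic way to implement this alternative strategy, sketched here on synthetic scores and groups (it is not a proposal drawn from Anderson (2010) or from this paper): keep group membership visible at decision time and choose per-group thresholds so that selection rates match, directly counteracting score skew introduced through proxies.

```python
# Group-aware post-processing sketch: use the sensitive attribute explicitly
# at decision time to equalize selection rates across groups. Scores and
# group labels are synthetic.
import numpy as np

def per_group_thresholds(scores, groups, target_rate):
    """Cutoff per group such that each group selects `target_rate` of its members."""
    return {g: np.quantile(scores[groups == g], 1 - target_rate) for g in np.unique(groups)}

def decide(scores, groups, thresholds):
    return np.array([s >= thresholds[g] for s, g in zip(scores, groups)])

rng = np.random.default_rng(3)
groups = rng.integers(0, 2, 1000)
scores = rng.normal(0.5 + 0.1 * groups, 0.2, 1000)  # scores skewed across groups via proxies
decisions = decide(scores, groups, per_group_thresholds(scores, groups, target_rate=0.3))
for g in (0, 1):
    print(f"group {g}: selection rate {decisions[groups == g].mean():.2f}")
```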

  35. Gendler (2011), Basu (2019a, b), Bolinger (2018), and Munton (2019a), among others.

  36. I have many to thank for the development of this paper over the years. Firstly, thanks to Tatyana Kostochka for drawing the illustrations used to make the figures. For comments and discussion about the paper, thanks to Josh Armstrong, Rima Basu, Erin Beeghly, Renee Bolinger, Michael Brownstein, Elisabeth Camp, David Chalmers, Sam Cumming, Gabe Dupre, Tina Eliassi-Rad, Maegan Fairchild, Branden Fitelson, Daniel Fogal, Deborah Hellman, Pamela Hieronymi, Justin Humphreys, Amber Kavka-Warren, Seth Lazar, Travis LaCroix, Dustin Locke, Alex Madva, Eric Mandelbaum, Annette Martin, Jessie Munton, Eleonore Neufeld, Sara Protasi, Chelsea Rosenthal, Ronni Gura Sadovsky, Ayana Samuel, Susanna Schellenberg, Susanna Siegel, Seana Shiffrin, Joy Shim, and Annette Zimmermann. Previous versions of this paper were presented at and received valuable feedback from the Vancouver Summer Philosophy Conference at the University of British Columbia, Athena in Action at Princeton University, Philosophy of Machine Learning: Knowledge and Causality at University of California, Irvine, and Bias in Context Four at the University of Utah. Finally, I want to acknowledge the helpful suggestions received from Nick Byrd and an anonymous referee at Synthese.

References

  • Adler, P., Falk, C., Friedler, S. A., Rybeck, G., Scheidegger, C., Smith, B., & Venkatasubramanian, S. (2016). Auditing black-box models for indirect influence. In 2016 IEEE 16th international conference on data mining (ICDM) (pp. 1–10). IEEE.

  • Alexander, L. (1992). What makes wrongful discrimination wrong? Biases, preferences, stereotypes, and proxies. University of Pennsylvania Law Review, 141(1), 149.

  • Anderson, E. (2010). The imperative of integration. Princeton: Princeton University Press.

  • Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016). Machine bias: There’s software used across the country to predict future criminals. And it’s biased against blacks. New York: ProPublica.

  • Antony, L. (2001). Quine as feminist: The radical import of naturalized epistemology. In L. Antony & C. E. Witt (Eds.), A mind of one’s own: Feminist essays on reason and objectivity (pp. 110–153). Boulder: Westview Press.

  • Antony, L. (2016). Bias: friend or foe? In M. Brownstein & J. Saul (Eds.), Implicit bias and philosophy volume 1: Metaphysics and epistemology (pp. 157–190). Oxford: Oxford University Press.

  • Ayala Lopez, S. (2016). Comments on Alex Madva’s ‘A plea for anti-anti-individualism: How oversimple psychology misleads social policy’. In Ergo symposium.

  • Ayala Lopez, S. (2018). A structural explanation of injustice in conversations: It’s about norms. Pacific Philosophical Quarterly, 99(4), 726–748.

  • Ayala Lopez, S., & Beeghly, E. (2020). Explaining injustice: Structural analysis, bias, and individuals. In E. Beeghly & A. Madva (Eds.), Introduction to implicit bias: Knowledge, justice, and the social mind. Abingdon: Routledge.

  • Ayala Lopez, S., & Vasilyeva, N. (2015). Explaining injustice in speech: Individualistic vs structural explanation. In R. Dale, C. Jennings, P. P. Maglio, T. Matlock, D. C. Noelle, A. Warlaumont, & J. Yoshimi (Eds.), Proceedings of the 37th annual conference of the Cognitive Science Society (pp. 130–135). Austin: Cognitive Science Society.

  • Basu, R. (2019a). The wrongs of racist beliefs. Philosophical Studies, 176(9), 2497–2515.

  • Basu, R. (2019b). What we epistemically owe to each other. Philosophical Studies, 176(4), 915–931.

  • Beeghly, E. (2015). What is a stereotype? What is stereotyping? Hypatia, 30(4), 675–691.

  • Blum, L. (2004). Stereotypes and stereotyping: A moral analysis. Philosophical Papers, 33(3), 251–289.

  • Bolinger, R. J. (2018). The rational impermissibility of accepting (some) racial generalizations. Synthese, 1–17.

  • Byrd, N. (2019). What we can (and can’t) infer about implicit bias from debiasing experiments. Synthese, 1–29.

  • Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183–186.

  • Chouldechova, A. (2016). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. arXiv preprint arXiv:1610.07524.

  • Corneille, O. & Hutter, M. (2020). Implicit? What do you mean? A comprehensive review of the delusive implicitness construct in attitude research. Personality and Social Psychology Review, 108886832091132.

  • Datta, A., Tschantz, M. C., & Datta, A. (2015). Automated experiments on ad privacy settings. In Proceedings on privacy enhancing technologies (Vol. 2015(1)).

  • Daumé III, H. (2015). A course in machine learning. https://ciml.info/.

  • Eberhardt, J. L., Goff, P. A., Purdie, V. J., & Davies, P. G. (2004). Seeing black: Race, crime, and visual processing. Journal of Personality and Social Psychology, 87(6), 876–893.

  • Epstein, A. (2016). Fox News’s biggest problem isn’t the Ailes ouster, it’s that its average viewer is a dinosaur. New York: Quartz Media.

  • Eubanks, V. (2018). Automating inequality: How high-tech tools profile, police and punish the poor. New York: St. Martin’s Press.

  • Gawronski, B., Hofmann, W., & Wilbur, C. J. (2006). Are implicit attitudes unconscious? Consciousness and Cognition, 15(3), 485–499.

  • Gendler, T. S. (2011). On the epistemic costs of implicit bias. Philosophical Studies, 156(1), 33–63.

  • Hahn, A., Judd, C. M., Hirsh, H. K., & Blair, I. V. (2014). Awareness of implicit attitudes. Journal of Experimental Psychology: General, 143(3), 1369–1392.

  • Haslanger, S. (2015). Social structure, narrative, and explanation. Canadian Journal of Philosophy, 45(1), 1–15.

  • Haslanger, S. (2016a). Comments on Alex Madva’s ‘A plea for anti-anti-individualism: How oversimple psychology misleads social policy’. In Ergo symposium.

  • Haslanger, S. (2016b). What is a (social) structural explanation? Philosophical Studies, 173(1), 113–130.

  • Hellman, D. (2019). Measuring algorithmic fairness. Virginia Public Law and Legal Theory Research Paper, 2019, 39.

  • Holroyd, J., Scaife, R., & Stafford, T. (2017). What is implicit bias? Philosophy Compass, 12(10), e12437.

  • Holroyd, J., & Sweetman, J. (2016). The heterogeneity of implicit bias. In M. Brownstein & J. Saul (Eds.), Implicit bias and philosophy volume 1: metaphysics and epistemology (pp. 80–103). Oxford: Oxford University Press.

  • Huebner, B. (2016). Implicit bias, reinforcement learning, and scaffolded moral cognition. In M. Brownstein & J. Saul (Eds.), Implicit bias and philosophy volume 1: Metaphysics and epistemology (pp. 47–79). Oxford: Oxford University Press.

  • Jennings, C., & Schwitzgebel, E. (2017). Women in philosophy: Quantitative analyses of specialization, prevalence, visibility, and generational change. Public Affairs Quarterly, 31, 83–105.

  • Johnson, G. M. (2020). The structure of bias. Mind. https://doi.org/10.1093/mind/fzaa011.

  • Kahneman, D., & Frederick, S. (2002). Representativeness revisited: Attribute substitution in intuitive judgment. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and Biases: The psychology of intuitive judgement (1st ed., pp. 49–81). Cambridge: Cambridge University Press.

  • Kings, A. E. (2019). Philosophy’s diversity problem: Understanding the underrepresentation of women and minorities in philosophy. Metaphilosophy, 50(3), 212–230.

  • Klare, B. F., Burge, M. J., Klontz, J. C., Vorder Bruegge, R. W., & Jain, A. K. (2012). Face recognition performance: Role of demographic information. IEEE Transactions on Information Forensics and Security, 7(6), 1789–1801.

  • Kleinberg, J., Mullainathan, S., & Raghavan, M. (2016). Inherent trade-offs in the fair determination of risk scores. arXiv preprint arXiv:1609.05807.

  • Lowry, S., & Macpherson, G. (1988). A blot on the profession. British Medical Journal (Clinical Research ed.), 296(6623), 657.

  • Madva, A. (2016). A plea for anti-anti-individualism: How oversimple psychology misleads social policy. Ergo, an Open Access Journal of Philosophy, 3, 701.

  • Miconi, T. (2017). The impossibility of “fairness”: a generalized impossibility result for decisions. arXiv:1707.01195.

  • Monin, B. (2003). The warm glow heuristic: When liking leads to familiarity. Journal of Personality and Social Psychology, 85(6), 1035–1048.

  • Munton, J. (2019a). Beyond accuracy: Epistemic flaws with statistical generalizations. Philosophical Issues, 29(1), 228–240.

  • Munton, J. (2019b). Bias in a biased system: Visual perceptual prejudice. In Bias, reason and enquiry: New perspectives from the crossroads of epistemology and psychology. Oxford: Oxford University Press.

  • Nagel, M., Peppers-Bates, S., Leuschner, A., & Lindemann, A. (2018). Feminism and philosophy. The American Philosophical Association, 17(2), 33.

  • Narayanan, A. (2016). Language necessarily contains human biases, and so will machines trained on language corpora. Freedom to Tinker. https://freedom-to-tinker.com/2016/08/24/language-necessarily-contains-human-biases-and-so-will-machines-trained-on-language-corpora/.

  • O’Neil, C. (2016). Weapons of math destruction: How big data increases inequality and threatens democracy. New York: Crown Publishing Group.

  • Paxton, M., Figdor, C., & Tiberius, V. (2012). Quantifying the gender gap: An empirical study of the underrepresentation of women in philosophy. Hypatia, 27(4), 949–957.

  • Price, R. (2016). Microsoft is deleting its AI chatbot’s incredibly racist tweets. New York: Business Insider.

  • Saul, J. (2013). Implicit bias, stereotype threat, and women in philosophy. In K. Hutchison & F. Jenkins (Eds.), Women in philosophy: What needs to change? (pp. 39–60). Oxford: Oxford University Press.

  • Soon, V. (2019). Implicit bias and social schema: A transactive memory approach. Philosophical Studies, 1–21.

  • Stephens-Davidowitz, S. (2014). Google, tell me. Is my son a genius? New York: The New York Times.

  • Stewart, A. J., & Valian, V. (2018). An inclusive academy: Achieving diversity and excellence. Cambridge: MIT Press.

  • Wilhelm, I., Conklin, S. L., & Hassoun, N. (2018). New data on the representation of women in philosophy journals: 2004–2015. Philosophical Studies, 175(6), 1441–1464.

  • Wu, X. & Zhang, Z. (2016). Automated inference on criminality using face images. arXiv preprint arXiv:1611.04135.

Cite this article

Johnson, G.M. Algorithmic bias: on the implicit biases of social technology. Synthese 198, 9941–9961 (2021). https://doi.org/10.1007/s11229-020-02696-y
