
Machine Learning and Irresponsible Inference: Morally Assessing the Training Data for Image Recognition Systems

Chapter in On the Cognitive, Ethical, and Scientific Dimensions of Artificial Intelligence

Part of the book series: Philosophical Studies Series (PSSP, volume 134)

Abstract

Just as humans can draw conclusions responsibly or irresponsibly, so too can computers. Machine learning systems that have been trained on data sets that include irresponsible judgments are likely to yield irresponsible predictions as outputs. In this paper I focus on a particular kind of inference a computer system might make: identification of the intentions with which a person acted on the basis of photographic evidence. Such inferences are liable to be morally objectionable, because of a way in which they are presumptuous. After elaborating this moral concern, I explore the possibility that carefully procuring the training data for image recognition systems could ensure that the systems avoid the problem. The lesson of this paper extends beyond just the particular case of image recognition systems and the challenge of responsibly identifying a person’s intentions. Reflection on this particular case demonstrates the importance (as well as the difficulty) of evaluating machine learning systems and their training data from the standpoint of moral considerations that are not encompassed by ordinary assessments of predictive accuracy.

Notes

  1.

    A few clarifications about the relationship between stereotypes and presumptuous judgment may be helpful. First, not all cases of presumptuous judgment involve stereotypes. Stereotypes involve associating an individual with a group (Blum 2004; Beeghly 2015). But it is possible to make a presumptuous judgment without relying on a group association. For instance, I might make a presumptuous judgment about a person’s intentions just on the basis of the assumption that her goals are the same as my own. Second, not all uses of stereotypes involve presumptuous judgments. This is simply because not all stereotypes are about persons’ intentions. Finally, regarding the moral features of stereotypes and presumptuous judgments: Presumptuousness, all else equal, tends to be morally undesirable, but it’s controversial whether this is true of all stereotypes. Beeghly (2015) argues that not all stereotyping is morally objectionable, and Lippmann (1922) saw positive and negative aspects of stereotyping. In contrast, Blum (2004) holds that stereotyping is always morally objectionable to some degree. My contention here, that presumptuous judgments manifest inadequate respect for persons as individuals, is consistent with Beeghly’s explanation of when and how stereotypes fail to respect persons as individuals. However, my thinking about why such a failure of respect is morally objectionable shares more with Blum’s analysis than with Beeghly’s. In the context of the present paper—with its focus on the moral evaluation of training data for machine learning systems—it is enough for my purposes if at least some judgments are morally objectionable precisely because of their presumptuousness.

  2.

    As Hodosh et al. (2013) point out, “Gricean maxims of relevance and quantity entail that image captions that are written for people usually provide precisely the kind of information that could not be obtained from the image itself, and thus tend to bear only a tenuous relation to what is actually depicted.”

  3.

    Though crowdwork raises ethical issues of its own (Marvit 2014).

  4.

    This comes from the online appendix to Hodosh et al. (2013).

  5.

    This suggests another way to explain what is wrong with presumptuous judgment. To judge a person’s mental states by the same standard we would use for any other sort of judgment not involving persons is to take what Peter Strawson (1962) called the “objective attitude,” rather than the “participant attitude,” toward the person.

  6.

    Of course, the image recognition system could report the falling water, and we could rely on some other process to infer from the falling water that it must be raining. But this would unduly limit the capacities of image recognition systems. A scene can be one that looks rainy, and that it looks rainy may be both more intuitive and more useful information than a report that water appears to be falling from above.

  7.

    There’s nothing special about the specific probability values of 0.02 and 0.98, besides the former being small and the latter being large. These values are just convenient for purposes of illustration. Values of 0.01 and 0.99 or 0.05 and 0.95 would have worked just as well (although values that were too extreme or too moderate would indeed alter the examples).

  8.

    I do not intend this as a criticism of the Flickr 8k data set. Violations of the instruction I am recommending seem to appear only rarely in the data set. However, this image and the next are valuable for illustrating the worry that is my focus.

  9.

    I do not mean to imply that Dennett himself is guilty of making this assumption.

  10.

    Cf. Blum (2004).

  11.

    And, of course, a further worry about this strategy concerns the thorny issue of how we might categorize intentions as attractive or unattractive in the first place.

  12.

    Such work is already underway. See, e.g., Park et al. (unpublished ms).

  13.

    Along these lines, Dennett argues, “the class of indistinguishably satisfactory models of the formal system embodied in [the] internal states [of an entity toward which we might take the intentional stance] gets smaller and smaller as we add such complexities [such as a wider range of behaviors]; the more we add, the richer or more demanding or specific the semantics of the system, until eventually we reach systems for which a unique semantic interpretation is practically (but never in principle) dictated” (1989b). Notoriously, according to both Quine and Davidson, some indeterminacy may be ineliminable. However, along with Dennett, I doubt that any remaining indeterminacy poses any practical or ethical problems in the context of machine learning systems. For discussion of indeterminacy and its (in)significance, see Davidson (1984b).

  14.

    This is a specific version of the type of problem James Moor (1985) has famously called “invisibility.”

References

  • Beeghly, Erin. 2015. What is a stereotype? What is stereotyping? Hypatia 30 (4): 675–691.

  • Blum, Lawrence. 2004. Stereotypes and stereotyping: A moral analysis. Philosophical Papers 33 (3): 251–289.

  • Davidson, Donald. 1984a. Inquiries into truth and interpretation. Oxford: Clarendon Press.

  • ———. 1984b. Belief and the basis of meaning. Reprinted in Davidson (1984a): 141–154.

  • ———. 2004a. Problems of rationality. Oxford: Clarendon Press.

  • ———. 2004b. Expressing evaluations. Reprinted in Davidson (2004a): 19–37.

  • Dennett, Daniel. 1989a. The intentional stance. Cambridge, MA: MIT Press.

  • ———. 1989b. True believers. Reprinted in Dennett (1989a): 13–35.

  • Fei-Fei, Li, and Li-Jia Li. 2010. What, where and who? Telling the story of an image by activity classification, scene recognition and object categorization. In Computer vision, ed. Cipolla et al., 157–171. Berlin: Springer.

  • Hodosh, Micah, Peter Young, and Julia Hockenmaier. 2013. Framing image description as a ranking task: Data, models and evaluation metrics. Journal of Artificial Intelligence Research 47: 853–899.

  • Karpathy, Andrej, and Li Fei-Fei. 2014. Deep visual-semantic alignments for generating image descriptions. arXiv preprint arXiv:1412.2306.

  • Lippmann, Walter. 1922. Public opinion. New York: Macmillan.

  • Marvit, Moshe. 2014. How crowdworkers became the ghosts in the digital machine. The Nation. http://www.thenation.com/article/how-crowdworkers-became-ghosts-digital-machine/. Accessed 11 Jan 2016.

  • Moor, James. 1985. What is computer ethics? Metaphilosophy 16 (4): 266–275.

  • Park, Eunbyung, Xufeng Han, Tamara Berg, and Alexander Berg. (unpublished ms). Combining multiple sources of knowledge in deep CNNs for action recognition. http://www.cs.unc.edu/~eunbyung/papers/wacv2016_combining.pdf. Accessed 11 Jan 2016.

  • Quine, W.V. 1960. Word and object. Cambridge, MA: MIT Press.

  • Rashtchian, Cyrus, Peter Young, Micah Hodosh, and Julia Hockenmaier. 2010. Collecting image annotations using Amazon’s Mechanical Turk. In Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk, 139–147. Association for Computational Linguistics.

  • Strawson, Peter. 1962. Freedom and resentment. Proceedings of the British Academy 48: 1–25.

  • Vinyals, Oriol, Alexander Toshev, Samy Bengio, and Dumitru Erhan. 2014. Show and tell: A neural image caption generator. arXiv preprint arXiv:1411.4555.


Acknowledgments

I am grateful to Andréa Atkins and to attendees of IACAP 2016 for discussion of these issues. A preliminary exposition of some of the ideas and arguments presented in this chapter appeared in a short essay posted on the website of the Loyola Center for Digital Ethics and Policy (http://www.digitalethics.org/).

Author information

Correspondence to Owen C. King.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

King, O.C. (2019). Machine Learning and Irresponsible Inference: Morally Assessing the Training Data for Image Recognition Systems. In: Berkich, D., d'Alfonso, M. (eds) On the Cognitive, Ethical, and Scientific Dimensions of Artificial Intelligence. Philosophical Studies Series, vol 134. Springer, Cham. https://doi.org/10.1007/978-3-030-01800-9_14

  • DOI: https://doi.org/10.1007/978-3-030-01800-9_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01799-6

  • Online ISBN: 978-3-030-01800-9

  • eBook Packages: Computer Science (R0)
