Dangers of Bias in Data-Intensive Information Systems

  • Baekkwan Park
  • Dhana L. Rao
  • Venkat N. Gudivada
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1162)


Data-intensive information systems (DIS) are pervasive and affect people in virtually all walks of life. Artificial intelligence and machine learning technologies are the backbone of DIS. The various types of bias embedded in DIS have serious implications for individuals as well as for society at large. In this paper, we discuss various types of bias, both human and machine, and suggest ways to eliminate or minimize them. We also make a case for digital ethics education and outline ways to incorporate such education into computing curricula.


Human bias, Algorithmic bias, Information systems, Digital ethics



Copyright information

© Springer Nature Singapore Pte Ltd. 2021

Authors and Affiliations

  • Baekkwan Park (1)
  • Dhana L. Rao (2)
  • Venkat N. Gudivada (3)

  1. Center for Survey Research, East Carolina University, Greenville, USA
  2. Department of Biology, East Carolina University, Greenville, USA
  3. Department of Computer Science, East Carolina University, Greenville, USA
