
Algorithmic Transparency via Quantitative Input Influence

  • Anupam Datta
  • Shayak Sen
  • Yair Zick
Chapter
Part of the Studies in Big Data book series (SBD, volume 32)

Abstract

Algorithmic systems that employ machine learning are often opaque—it is difficult to explain why a certain decision was made. We present a formal foundation to improve the transparency of such decision-making systems. Specifically, we introduce a family of Quantitative Input Influence (QII) measures that capture the degree of input influence on system outputs. These measures provide a foundation for the design of transparency reports that accompany system decisions (e.g., explaining a specific credit decision) and for testing tools useful for internal and external oversight (e.g., to detect algorithmic discrimination). Distinctively, our causal QII measures carefully account for correlated inputs while measuring influence. They support a general class of transparency queries and can, in particular, explain decisions about individuals and groups. Finally, since single inputs may not always have high influence, the QII measures also quantify the joint influence of a set of inputs (e.g., age and income) on outcomes (e.g., loan decisions) and the average marginal influence of individual inputs within such a set (e.g., income) using principled aggregation measures, such as the Shapley value, previously applied to measure influence in voting.
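To make the interventional flavor of these measures concrete, the sketch below shows one way such quantities could be estimated by sampling. It is an illustrative approximation, not the authors' implementation: the names `model`, `X`, and `quantity` are hypothetical placeholders for a classifier with a scikit-learn-style `predict`, a NumPy array of sampled inputs, and a quantity of interest (here, the rate of positive classifications).

```python
# Illustrative QII-style sketch (assumed interfaces, not the reference implementation).
import numpy as np

def set_qii(model, X, features, quantity=lambda y: np.mean(y == 1), seed=0):
    """Influence of a feature set: change in the quantity of interest when those
    features are replaced by draws from their marginals (approximated here by
    independently permuting each intervened column)."""
    rng = np.random.default_rng(seed)
    baseline = quantity(model.predict(X))
    X_int = X.copy()
    for f in features:
        X_int[:, f] = rng.permutation(X[:, f])  # break the link with the other inputs
    return baseline - quantity(model.predict(X_int))

def shapley_qii(model, X, feature, all_features, n_samples=200, seed=0):
    """Average marginal influence of one feature within random coalitions,
    i.e. a sampling approximation of its Shapley value."""
    rng = np.random.default_rng(seed)
    others = [f for f in all_features if f != feature]
    total = 0.0
    for _ in range(n_samples):
        k = rng.integers(0, len(others) + 1)          # coalition size, uniform over 0..n-1
        S = list(rng.choice(others, size=k, replace=False)) if k else []
        total += set_qii(model, X, S + [feature]) - set_qii(model, X, S)
    return total / n_samples
```

Picking the coalition size uniformly and then a uniform subset of that size reproduces the Shapley weighting in expectation, so averaging the sampled marginal contributions approximates the Shapley value of the chosen feature.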

Keywords

Classification Outcome · Cooperative Game · Marginal Contribution · Data Analytic System · Influence Measure


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Carnegie Mellon University, Pittsburgh, USA
  2. School of Computing, National University of Singapore, Singapore
