Responsibly Innovating Data Mining and Profiling Tools: A New Approach to Discrimination Sensitive and Privacy Sensitive Attributes

Abstract

Data mining is a technology that extracts useful information, such as patterns and trends, from large amounts of data. Both the privacy-sensitive input data and the output data, which is often used for selections, deserve protection against abuse. In this paper we describe one of the main results of our research project on developing new privacy-preserving and discrimination-aware data mining tools: why the common measures for mitigating privacy and discrimination concerns, such as a priori limiting measures (particularly access controls, anonymity and purpose specification), are increasingly failing in the novel context of advanced data mining and profiling. Contrary to previous attempts to protect privacy and prevent discrimination in data mining, we did not focus on new designs that better enable (a priori) access-limiting measures regarding input data, but rather on (a posteriori) responsibility and transparency. Instead of limiting access to data, which is increasingly hard to enforce in a world of automated and interlinked databases and information networks, we stressed the question of how data can and may be used.

Notes

  1. http://wwwis.win.tue.nl/~tcalders/dadm/doku.php

  2. Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995, [1995] OJ L281/31.

  3. Proposal for a Regulation of the European Parliament and of the Council on the protection of individuals with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation), Brussels, 25.1.2012, COM(2012) 11 final, 2012/0011 (COD).

Acknowledgements

The authors would like to thank the Netherlands Organization for Scientific Research (NWO) for enabling this research.

Corresponding author

Correspondence to Bart H. M. Custers, Ph.D., M.Sc., LLM.

Copyright information

© 2014 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Custers, B.H.M., Schermer, B.W. (2014). Responsibly Innovating Data Mining and Profiling Tools: A New Approach to Discrimination Sensitive and Privacy Sensitive Attributes. In: van den Hoven, J., Doorn, N., Swierstra, T., Koops, BJ., Romijn, H. (eds) Responsible Innovation 1. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-8956-1_19
