Abstract
Data mining is a technology that extracts useful information, such as patterns and trends, from large amounts of data. Both the privacy-sensitive input data and the output data, which is often used for selection, deserve protection against abuse. In this paper we describe one of the main results of our research project on developing new privacy-preserving and discrimination-aware data mining tools: an explanation of why the common measures for mitigating privacy and discrimination concerns, in particular a priori limiting measures such as access controls, anonymity and purpose specification, increasingly fail in the novel context of advanced data mining and profiling. In contrast to previous attempts to protect privacy and prevent discrimination in data mining, we did not focus on new designs that better enable (a priori) access-limiting measures for input data, but on (a posteriori) responsibility and transparency. Instead of limiting access to data, which is increasingly hard to enforce in a world of automated and interlinked databases and information networks, we stressed the question of how data can and may be used.
Notes
- 1.
- 2. Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995, [1995] OJ L281/31.
- 3. Proposal for a Regulation of the European Parliament and of the Council on the protection of individuals with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation), Brussels, 25.1.2012, COM(2012) 11 final, 2012/0011 (COD).
Acknowledgements
The authors would like to thank the Netherlands Organization for Scientific Research (NWO) for enabling this research.
Copyright information
© 2014 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Custers, B.H.M., Schermer, B.W. (2014). Responsibly Innovating Data Mining and Profiling Tools: A New Approach to Discrimination Sensitive and Privacy Sensitive Attributes. In: van den Hoven, J., Doorn, N., Swierstra, T., Koops, BJ., Romijn, H. (eds) Responsible Innovation 1. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-8956-1_19
DOI: https://doi.org/10.1007/978-94-017-8956-1_19
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-017-8955-4
Online ISBN: 978-94-017-8956-1
eBook Packages: Humanities, Social Sciences and Law; Philosophy and Religion (R0)