Beyond the EULA: Improving Consent for Data Mining

  • Luke Hutton
  • Tristan Henderson
Part of the Studies in Big Data book series (SBD, volume 32)


Companies and academic researchers may collect, process, and distribute large quantities of personal data without the explicit knowledge or consent of the individuals to whom the data pertain. Existing forms of consent often fail to be appropriately readable, and ethical oversight of data mining may not be sufficient. This raises the question of whether existing consent instruments are sufficient, logistically feasible, or even necessary for data mining. In this chapter, we review the data collection and mining landscape, including commercial and academic activities, and the relevant data protection concerns, to determine the types of consent instruments used. Drawing on three case studies, we apply the new paradigm of human-data interaction to examine whether these existing approaches are appropriate. We then introduce an approach to consent that has been empirically demonstrated to improve on the state of the art and deliver meaningful consent. Finally, we propose some best practices for data collectors to ensure their data mining activities do not violate the expectations of the people to whom the data relate.


Data Mining, Personal Data, Social Network Site, Data Owner, Contextual Integrity
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



This work was supported by the Engineering and Physical Sciences Research Council [grant number EP/L021285/1].



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Centre for Research in Computing, The Open University, Milton Keynes, UK
  2. School of Computer Science, University of St Andrews, St Andrews, UK
