Privacy and Policy in Polystores: A Data Management Research Agenda

  • Conference paper
Heterogeneous Data Management, Polystores, and Analytics for Healthcare (DMAH 2019, Poly 2019)

Abstract

Modern data-driven technologies are providing new capabilities for working with data across diverse storage architectures and analyzing it in unified frameworks to yield powerful insights. These new analysis capabilities, which rely on correlating data across sources and types and exploiting statistical structure, have challenged classical approaches to privacy, leading to a need for radical rethinking of the meaning of the term. In the area of federated database technologies, there is a growing recognition that new technologies like polystores must incorporate the mitigation of privacy risks into the design process.

Notes

  1. \(2^{33} = 8,589,934,592\), compared to a world population of around 7.7 billion.

  2. Frameworks have been proposed to grapple with the multi-dimensional nature of privacy. See Mulligan et al.’s privacy analytic [27].

References

  1. Albarghouthi, A., D’Antoni, L., Drews, S., Nori, A.: Fairness as a program property. arXiv preprint arXiv:1610.06067 (2016)

  2. Anderson, R.: Security Engineering. Wiley, New York (2008)

  3. Bater, J., He, X., Ehrich, W., Machanavajjhala, A., Rogers, J.: Shrinkwrap: efficient SQL query processing in differentially private data federations. Proc. VLDB Endowment 12(3), 307–320 (2018)

  4. Bhowmick, A., Duchi, J., Freudiger, J., Kapoor, G., Rogers, R.: Protection against reconstruction and its applications in private federated learning. arXiv preprint arXiv:1812.00984 (2018)

  5. Bruening, P.J., Waterman, K.K.: Data tagging for new information governance models. IEEE Secur. Priv. 8(5), 64–68 (2010)

  6. Cohen, A., Nissim, K.: Towards formalizing the GDPR notion of singling out. arXiv preprint arXiv:1904.06009 (2019)

  7. Cranor, L.F., Idouchi, K., Leon, P.G., Sleeper, M., Ur, B.: Are they actually any different? Comparing thousands of financial institutions’ privacy practices. In: Proceedings of the WEIS, vol. 13 (2013)

  8. Datta, A., Sen, S., Zick, Y.: Algorithmic transparency via quantitative input influence: theory and experiments with learning systems. In: IEEE Symposium on Security and Privacy (SP), pp. 598–617. IEEE (2016)

  9. Duggan, J., et al.: The BigDAWG polystore system. ACM Sigmod Rec. 44(2), 11–16 (2015)

  10. Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., Naor, M.: Our data, ourselves: privacy via distributed noise generation. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 486–503. Springer, Heidelberg (2006). https://doi.org/10.1007/11761679_29

  11. Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876, pp. 265–284. Springer, Heidelberg (2006). https://doi.org/10.1007/11681878_14

  12. Dwork, C., Rothblum, G.N., Vadhan, S.: Boosting and differential privacy. In: 51st Annual Symposium on Foundations of Computer Science, pp. 51–60. IEEE (2010)

  13. Federal Trade Commission: FTC Policy Statement on Deception. 103 F.T.C. 110, 174 (1984). https://www.ftc.gov/public-statements/1983/10/ftc-policy-statement-deception

  14. Federal Trade Commission: FTC Policy Statement on Unfairness. 104 F.T.C. 949, 1070 (1984). https://www.ftc.gov/public-statements/1980/12/ftc-policy-statement-unfairness

  15. Feigenbaum, J., Weitzner, D.J.: On the incommensurability of laws and technical mechanisms: or, what cryptography can’t do. In: Matyáš, V., Švenda, P., Stajano, F., Christianson, B., Anderson, J. (eds.) Security Protocols 2018. LNCS, vol. 11286, pp. 266–279. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-03251-7_31

  16. Gellman, R.: Fair information practices: a basic history. SSRN 2415020 (2017)

  17. Hoofnagle, C.J.: Federal Trade Commission Privacy Law and Policy. Cambridge University Press, Cambridge (2016)

  18. Hsu, J., et al.: Differential privacy: an economic method for choosing epsilon. In: 27th Computer Security Foundations Symposium, pp. 398–410. IEEE (2014)

  19. Johnson, N., Near, J.P., Song, D.: Towards practical differential privacy for SQL queries. Proc. VLDB Endowment 11(5), 526–539 (2018)

  20. Kamarinou, D., Millard, C., Oldani, I.: Compliance as a service. Queen Mary School of Law Legal Studies Research Paper, No. 287/2018 (2018)

  21. Kohli, N., Laskowski, P.: Epsilon voting: mechanism design for parameter selection in differential privacy. In: IEEE Symposium on Privacy-Aware Computing (PAC), pp. 19–30. IEEE (2018)

  22. Kroll, J.A., et al.: Accountable algorithms. Univ. PA. Law Rev. 165(3), 633–705 (2017)

  23. Lee, J., Clifton, C.: How much is enough? Choosing \(\varepsilon \) for differential privacy. In: Lai, X., Zhou, J., Li, H. (eds.) ISC 2011. LNCS, vol. 7001, pp. 325–340. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24861-0_22

  24. Machanavajjhala, A., Gehrke, J., Kifer, D., Venkitasubramaniam, M.: L-Diversity: privacy beyond k-anonymity. In: 22nd International Conference on Data Engineering (ICDE 2006), pp. 24–24. IEEE (2006)

  25. McSherry, F.D.: Privacy integrated queries: an extensible platform for privacy-preserving data analysis. In: Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, pp. 19–30. ACM (2009)

  26. Mironov, I.: On significance of the least significant bits for differential privacy. In: Proceedings of the 2012 ACM Conference on Computer and Communications Security, pp. 650–661. ACM (2012)

  27. Mulligan, D.K., Koopman, C., Doty, N.: Privacy is an essentially contested concept: a multi-dimensional analytic for mapping privacy. Philos. Trans. R. Soc. A 374(2083), 20160118 (2016)

  28. Nabar, S.U., Kenthapadi, K., Mishra, N., Motwani, R.: A survey of query auditing techniques for data privacy. In: Aggarwal, C.C., Yu, P.S. (eds.) Privacy-Preserving Data Mining. Advances in Database Systems, vol. 34, pp. 415–431. Springer, Boston (2008). https://doi.org/10.1007/978-0-387-70992-5_17

  29. Naldi, M., D’Acquisto, G.: Differential privacy: an estimation theory-based method for choosing epsilon. arXiv preprint arXiv:1510.00917 (2015)

  30. Narayanan, A., Felten, E.W.: No silver bullet: de-identification still doesn’t work. Manuscript (2014)

  31. Narayanan, A., Huey, J., Felten, E.W.: A precautionary approach to big data privacy. In: Gutwirth, S., Leenes, R., De Hert, P. (eds.) Data Protection on the Move. LGTS, vol. 24, pp. 357–385. Springer, Dordrecht (2016). https://doi.org/10.1007/978-94-017-7376-8_13

  32. Narayanan, A., Shmatikov, V.: Robust de-anonymization of large, sparse datasets. In: IEEE Security and Privacy (2008)

  33. Narayanan, A., Shmatikov, V.: Myths and fallacies of personally identifiable information. Commun. ACM 53(6), 24–26 (2010)

  34. Nissim, K., et al.: Bridging the gap between computer science and legal approaches to privacy. Harvard J. Law Technol. 31(2), 687–780 (2018)

  35. Papernot, N., Abadi, M., Erlingsson, U., Goodfellow, I., Talwar, K.: Semi-supervised knowledge transfer for deep learning from private training data. arXiv preprint arXiv:1610.05755 (2016)

  36. Selbst, A.D., Powles, J.: Meaningful information and the right to explanation. Int. Data Priv. Law 7(4), 233–242 (2017)

  37. Sweeney, L.: k-anonymity: a model for protecting privacy. Int. J. Uncertainty Fuzziness Knowl. Based Syst. 10(05), 557–570 (2002)

  38. Truex, S., Baracaldo, N., Anwar, A., Steinke, T., Ludwig, H., Zhang, R.: A hybrid approach to privacy-preserving federated learning. arXiv preprint arXiv:1812.03224 (2018)

  39. United States Department of Health, Education, and Welfare, Secretary’s Advisory Committee on Automated Personal Data Systems: Records, Computers, and the Rights of Citizens: Report. MIT Press (1973)

  40. Wachter, S., Mittelstadt, B.: A right to reasonable inferences: re-thinking data protection law in the age of big data and AI. Columbia Bus. Law Rev. (2018)

  41. Warren, S., Brandeis, L.: The right to privacy. Harvard Law Rev. 4, 193–220 (1890)

  42. Wu, X., Li, F., Kumar, A., Chaudhuri, K., Jha, S., Naughton, J.: Bolt-on differential privacy for scalable stochastic gradient descent-based analytics. In: Proceedings of the 2017 ACM International Conference on Management of Data, pp. 1307–1322. ACM (2017)

Acknowledgements

The work of authors Kroll and Kohli was supported in part by the National Security Agency (NSA). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSA. Kroll was also supported by the Berkeley Center for Law and Technology at the University of California, Berkeley Law School. Author Laskowski was supported by the Center for Long Term Cybersecurity at the University of California, Berkeley.

Author information

Corresponding author

Correspondence to Joshua A. Kroll.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Kroll, J.A., Kohli, N., Laskowski, P. (2019). Privacy and Policy in Polystores: A Data Management Research Agenda. In: Gadepally, V., et al. Heterogeneous Data Management, Polystores, and Analytics for Healthcare. DMAH 2019, Poly 2019. Lecture Notes in Computer Science, vol. 11721. Springer, Cham. https://doi.org/10.1007/978-3-030-33752-0_5

  • DOI: https://doi.org/10.1007/978-3-030-33752-0_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33751-3

  • Online ISBN: 978-3-030-33752-0

  • eBook Packages: Computer Science, Computer Science (R0)
