ARUBA: A Risk-Utility-Based Algorithm for Data Disclosure

  • Conference paper
Secure Data Management (SDM 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5159)

Abstract

Dealing with sensitive data has been the focus of much recent research. On the one hand, data disclosure may incur risk due to security breaches; on the other hand, data sharing has many advantages. For example, revealing customer transactions at a grocery store may be beneficial when studying purchasing patterns and market demand. However, misuse of the revealed information may cause harm through privacy violations. In this paper we study the tradeoff between data disclosure and data retention. Specifically, we address the problem of minimizing the risk of data disclosure while maintaining its utility above a certain acceptable threshold. We formulate the problem as a discrete optimization problem and leverage the monotonicity of both risk and utility to construct an efficient algorithm that solves it. The algorithm determines the optimal transformations to perform on the microdata before it is released. These transformations take into account both the risk associated with data disclosure and its benefit (referred to as utility). Through extensive experimental studies we compare the performance of our proposed algorithm with other data disclosure algorithms in the literature in terms of risk, utility, and time. We show that our proposed framework outperforms other techniques for sensitive data disclosure.
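The abstract states the problem (minimize risk while keeping utility above a threshold, exploiting monotonicity) but not the algorithm itself. The following is only a minimal sketch of that kind of search, not the paper's method. It assumes, hypothetically, that a transformation is a tuple of generalization levels (one per attribute), and that both `risk` and `utility` never increase when any level is raised. The names `min_risk_above_utility`, `max_levels`, `toy_risk`, and `toy_utility` are illustrative and do not come from the paper.

```python
from collections import deque
from typing import Callable, Optional, Sequence, Tuple

Node = Tuple[int, ...]  # assumed encoding: one generalization level per attribute


def min_risk_above_utility(
    max_levels: Sequence[int],
    risk: Callable[[Node], float],
    utility: Callable[[Node], float],
    threshold: float,
) -> Optional[Tuple[Node, float]]:
    """Return (node, risk) with the smallest risk among nodes whose
    utility is >= threshold, or None if no node qualifies.

    Assumes risk and utility are both non-increasing in every level.
    """
    start: Node = tuple(0 for _ in max_levels)
    if utility(start) < threshold:
        # The least generalized node has the highest utility, so nothing is feasible.
        return None

    best: Optional[Tuple[Node, float]] = None
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        on_frontier = True
        for i, cap in enumerate(max_levels):
            if node[i] >= cap:
                continue
            child = node[:i] + (node[i] + 1,) + node[i + 1:]
            # Monotone utility: an infeasible child makes everything above it
            # infeasible too, so that branch is never expanded.
            if utility(child) >= threshold:
                on_frontier = False
                if child not in seen:
                    seen.add(child)
                    queue.append(child)
        # Monotone risk: the minimum is attained at a feasible node with no
        # feasible child, so risk is evaluated only there.
        if on_frontier:
            r = risk(node)
            if best is None or r < best[1]:
                best = (node, r)
    return best


# Toy, made-up risk and utility functions over two attributes.
def toy_utility(node: Node) -> float:
    return 10 - 2 * node[0] - 3 * node[1]


def toy_risk(node: Node) -> float:
    return 12 - 3 * node[0] - 2 * node[1]


result = min_risk_above_utility((2, 3), toy_risk, toy_utility, threshold=4)
print(result)
```

In this toy setting the search evaluates risk only at the feasible nodes that cannot be generalized further without dropping below the threshold, which is the pruning that monotonicity allows.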



Editor information

Willem Jonker, Milan Petković

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fouad, M.R., Lebanon, G., Bertino, E. (2008). ARUBA: A Risk-Utility-Based Algorithm for Data Disclosure. In: Jonker, W., Petković, M. (eds) Secure Data Management. SDM 2008. Lecture Notes in Computer Science, vol 5159. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85259-9_3

  • DOI: https://doi.org/10.1007/978-3-540-85259-9_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85258-2

  • Online ISBN: 978-3-540-85259-9

  • eBook Packages: Computer Science, Computer Science (R0)
