Abstract
Generalization and suppression are two of the most widely used techniques for achieving k-anonymity. Generalization is also used in machine learning to obtain domain models useful for classification, and suppression is the means of achieving such generalization. In this paper we address the anonymization of data in a way that preserves the classification task. We propose using machine learning methods to obtain partial domain theories formed by partial descriptions of classes. Unlike in machine learning, we require these descriptions to be as specific as possible, i.e., formed by the maximum number of attributes. This is achieved by suppressing some values of some records. Our method suppresses a particular value of an attribute only in a subset of records, that is, it uses local suppression. This avoids one of the problems of global suppression, namely the loss of more information than necessary.
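The contrast between global and local suppression can be illustrated with a minimal Python sketch. It is not the method proposed in the paper: the toy records, the attribute names (`age`, `zip`, `label`), the `*` suppression marker, and all function names are assumptions introduced only for illustration. The sketch shows that both strategies can reach 2-anonymity on the same table, while local suppression masks fewer values.

```python
from collections import Counter

SUPPRESSED = "*"  # hypothetical marker for a suppressed value


def global_suppression(records, attribute):
    """Suppress `attribute` in every record of the table."""
    return [{**r, attribute: SUPPRESSED} for r in records]


def local_suppression(records, attribute, should_suppress):
    """Suppress `attribute` only in the records selected by `should_suppress`."""
    return [
        {**r, attribute: SUPPRESSED} if should_suppress(r) else dict(r)
        for r in records
    ]


def is_k_anonymous(records, quasi_identifiers, k):
    """Check that every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[a] for a in quasi_identifiers) for r in records)
    return all(size >= k for size in groups.values())


def count_suppressed(records):
    """Count how many attribute values have been masked."""
    return sum(v == SUPPRESSED for r in records for v in r.values())


# Toy table (invented data): the last two records are unique on (age, zip).
records = [
    {"age": "30-39", "zip": "08001", "label": "pos"},
    {"age": "30-39", "zip": "08001", "label": "pos"},
    {"age": "40-49", "zip": "08002", "label": "neg"},
    {"age": "40-49", "zip": "08003", "label": "neg"},
]
quasi_identifiers = ["age", "zip"]

globally_masked = global_suppression(records, "zip")
locally_masked = local_suppression(
    records, "zip", lambda r: r["age"] == "40-49"
)

print(is_k_anonymous(records, quasi_identifiers, 2))          # False
print(is_k_anonymous(globally_masked, quasi_identifiers, 2))  # True
print(is_k_anonymous(locally_masked, quasi_identifiers, 2))   # True
print(count_suppressed(globally_masked), count_suppressed(locally_masked))  # 4 2
```

In this toy example the subset of records to mask is chosen by hand; how that subset is selected from partial class descriptions is the subject of the paper itself and is not reproduced here.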
Acknowledgments
This research is partially funded by the project RPREF (CSIC Intramural 201650E044) and the grant 2014-SGR-118 from the Generalitat de Catalunya.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Armengol, E., Torra, V. (2016). Partial Domain Theories for Privacy. In: Torra, V., Narukawa, Y., Navarro-Arribas, G., Yañez, C. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2016. Lecture Notes in Computer Science, vol 9880. Springer, Cham. https://doi.org/10.1007/978-3-319-45656-0_18
DOI: https://doi.org/10.1007/978-3-319-45656-0_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45655-3
Online ISBN: 978-3-319-45656-0
eBook Packages: Computer Science, Computer Science (R0)