Automatic Construction of Generalization Hierarchies for Publishing Anonymized Data

Ayala-Rivera, Vanessa; Murphy, Liam; Thorpe, Christina

doi:10.1007/978-3-319-47650-6_21

Conference paper
First Online: 05 October 2016

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9983)

Abstract

Concept hierarchies are widely used across many fields for data analysis. In data privacy, they are known as Value Generalization Hierarchies (VGHs), and generalization algorithms use them to determine how data are anonymized. Their proper specification is therefore critical to obtaining anonymized data of good quality. Creating and evaluating VGHs requires expert knowledge and significant manual effort, which makes these tasks error-prone and time-consuming. In this paper we present AIKA, a knowledge-based framework that automatically constructs and evaluates VGHs for the anonymization of categorical data. AIKA integrates ontologies to create and evaluate VGHs objectively. It also implements a multi-dimensional reward function that tailors the VGH evaluation to different use cases. Our experiments show that AIKA generates VGHs of good quality in less time than manual construction. The results also show that the reward function captures the desired VGH properties.
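To make the role of a VGH concrete, the following is a minimal sketch of how a generalization algorithm might consult one for a categorical attribute. It is not AIKA's implementation: the attribute, the labels, and the names `VGH_PARENT`, `generalize`, and `generalize_column` are all hypothetical, and the hierarchy is assumed to be a simple tree stored as a child-to-parent map.

```python
# Hypothetical VGH for a categorical "occupation" attribute, stored as a
# child -> parent map. Labels are illustrative only, not taken from the paper.
VGH_PARENT = {
    "nurse": "healthcare",
    "surgeon": "healthcare",
    "teacher": "education",
    "professor": "education",
    "healthcare": "any",
    "education": "any",
}


def generalize(value, levels, parent=VGH_PARENT):
    """Climb `levels` steps up the hierarchy, stopping early at the root."""
    for _ in range(levels):
        if value not in parent:  # the root has no parent entry
            break
        value = parent[value]
    return value


def generalize_column(values, levels, parent=VGH_PARENT):
    """Apply the same generalization level to every value of an attribute."""
    return [generalize(v, levels, parent) for v in values]
```

Under these assumptions, `generalize_column(["nurse", "teacher"], 1)` replaces each value with its parent category, so the shape of the hierarchy directly controls how much detail the anonymized data retains.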

Acknowledgments

This work was supported by Science Foundation Ireland grants 10/CE/I1855 and 13/RC/2094.

Author information

Authors and Affiliations

Lero@UCD, School of Computer Science, University College Dublin, Dublin, Ireland
Vanessa Ayala-Rivera, Liam Murphy & Christina Thorpe

Corresponding author

Correspondence to Vanessa Ayala-Rivera.

Editor information

Editors and Affiliations

University of Passau, Passau, Germany
Franz Lehner
University of Passau, Passau, Germany
Nora Fteimi

About this paper

Cite this paper

Ayala-Rivera, V., Murphy, L., Thorpe, C. (2016). Automatic Construction of Generalization Hierarchies for Publishing Anonymized Data. In: Lehner, F., Fteimi, N. (eds) Knowledge Science, Engineering and Management. KSEM 2016. Lecture Notes in Computer Science, vol 9983. Springer, Cham. https://doi.org/10.1007/978-3-319-47650-6_21

DOI: https://doi.org/10.1007/978-3-319-47650-6_21
Published: 05 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47649-0
Online ISBN: 978-3-319-47650-6
eBook Packages: Computer Science, Computer Science (R0)
