Automatic Construction of Generalization Hierarchies for Publishing Anonymized Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9983)

Abstract

Concept hierarchies are widely used across many fields for data analysis. In data privacy, they are known as Value Generalization Hierarchies (VGHs), and generalization algorithms use them to determine how data is anonymized. Their proper specification is therefore critical to obtaining anonymized data of good quality. Creating and evaluating VGHs requires expert knowledge and significant manual effort, which makes these tasks error-prone and time-consuming. In this paper we present AIKA, a knowledge-based framework that automatically constructs and evaluates VGHs for the anonymization of categorical data. AIKA integrates ontologies to create and evaluate VGHs objectively. It also implements a multi-dimensional reward function that tailors the VGH evaluation to different use cases. Our experiments show that AIKA generates VGHs of good quality in less time than manual construction. The results also show that the reward function properly captures the desired VGH properties.
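To make the role of a VGH concrete, the following is a minimal sketch of how a generalization algorithm might consult one for a categorical attribute. It is not AIKA's construction method: the attribute values, the `PARENT` mapping, and the `generalize` helper are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional

# Hypothetical toy VGH for a categorical attribute such as "occupation".
# Each value maps to its parent; the root maps to None.
PARENT: Dict[str, Optional[str]] = {
    "nurse": "healthcare",
    "surgeon": "healthcare",
    "teacher": "education",
    "professor": "education",
    "healthcare": "any",
    "education": "any",
    "any": None,
}


def generalize(value: str, levels: int) -> str:
    """Climb `levels` steps up the hierarchy, stopping at the root."""
    current = value
    for _ in range(levels):
        parent = PARENT[current]
        if parent is None:
            break  # already at the root; cannot generalize further
        current = parent
    return current


# Replace each original value with its ancestor one level up.
records = ["nurse", "surgeon", "teacher", "professor"]
level1 = [generalize(v, 1) for v in records]
```

In this sketch, generalizing one level maps "nurse" and "surgeon" to the same value, so the two records become indistinguishable on this attribute. How values are grouped under each parent therefore decides how much information the anonymized data retains, which is why the quality of the hierarchy matters.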

Acknowledgments

This work was supported by Science Foundation Ireland grants 10/CE/I1855 and 13/RC/2094.

Author information

Corresponding author

Correspondence to Vanessa Ayala-Rivera.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Ayala-Rivera, V., Murphy, L., Thorpe, C. (2016). Automatic Construction of Generalization Hierarchies for Publishing Anonymized Data. In: Lehner, F., Fteimi, N. (eds) Knowledge Science, Engineering and Management. KSEM 2016. Lecture Notes in Computer Science, vol 9983. Springer, Cham. https://doi.org/10.1007/978-3-319-47650-6_21

  • DOI: https://doi.org/10.1007/978-3-319-47650-6_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47649-0

  • Online ISBN: 978-3-319-47650-6

  • eBook Packages: Computer Science, Computer Science (R0)
