Abstract
Privacy preservation is a substantial concern for organizations that publish or share personal data for informal analysis. Several anonymization algorithms, such as generalization and bucketization, have been developed to address this problem of Privacy Preserving Data Publishing (PPDP). Recent research has shown that generalization loses a significant amount of information, particularly for high-dimensional data, while bucketization does not prevent membership disclosure. In this paper, we propose a novel approach that uses the Information Gain of each attribute with respect to the sensitive attributes. Information Gain measures how effective an attribute is in classifying the data and captures a two-way association among attributes. We show that our approach preserves data utility better and has lower complexity than earlier techniques. The proposed technique is analyzed theoretically, and the analysis, supported by experiments, indicates that it outperforms past work.
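As an illustration of the quantity the approach relies on, the following minimal sketch computes the Information Gain of an attribute with respect to a sensitive attribute, using the standard definition IG(S; A) = H(S) - H(S | A). It is not the procedure of this paper: the function names and the toy table (`age_band`, `gender`, `disease`) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(values):
    """Shannon entropy (in bits) of a non-empty list of discrete values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(attribute, sensitive):
    """Standard information gain: H(sensitive) - H(sensitive | attribute)."""
    # Group the sensitive values by the value of the attribute.
    groups = defaultdict(list)
    for a, s in zip(attribute, sensitive):
        groups[a].append(s)
    total = len(sensitive)
    # Weighted average entropy of the sensitive attribute within each group.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(sensitive) - conditional


# Hypothetical toy table: two candidate attributes and one sensitive attribute.
age_band = ["young", "young", "old", "old"]
gender = ["F", "M", "F", "M"]
disease = ["flu", "flu", "cancer", "cancer"]

gains = {
    "age_band": information_gain(age_band, disease),
    "gender": information_gain(gender, disease),
}
print(gains)
```

In this toy table, `age_band` fully determines `disease` and therefore has the higher gain, while `gender` carries no information about it. An attribute's gain could then be used to rank how strongly it is associated with the sensitive attribute; how such scores feed into the grouping of attributes is specific to the paper and is not shown here.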
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ashoka, K., Poornima, B. (2018). Mutual Correlation-based Optimal Slicing for Preserving Privacy in Data Publishing. In: Satapathy, S., Bhateja, V., Das, S. (eds) Smart Computing and Informatics. Smart Innovation, Systems and Technologies, vol 77. Springer, Singapore. https://doi.org/10.1007/978-981-10-5544-7_58
DOI: https://doi.org/10.1007/978-981-10-5544-7_58
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-5543-0
Online ISBN: 978-981-10-5544-7
eBook Packages: Engineering, Engineering (R0)