Improved k-Anonymity Privacy-Preserving Algorithm Using Madhya Pradesh State Election Commission Big Data
Modern technology produces a large number of public and private data sets, which makes securing personal data unavoidable. Initially, priority was given to securing the data of organizations and companies, but it is now equally necessary to secure personal data. Protecting individual data is therefore critical to achieving information security. Data anonymization is a privacy-preserving technique for data publishing: it allows practically useful information to be published for data mining while keeping each individual's information confidential. This chapter presents an implementation of data anonymization using a proposed improved k-anonymity algorithm, applied to a large candidate election data set acquired from the Madhya Pradesh (MP, India) State Election Commission. In addition to providing greater privacy, the algorithm runs in less time than the traditional k-anonymity algorithm, and so it can meet the data protection needs of the current big data environment.
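For readers unfamiliar with the baseline, the following is a minimal sketch of traditional k-anonymity through generalization and suppression. It is not the improved algorithm proposed in the chapter. The field names (`age`, `district`, `party`), the helper functions, and the toy records are all hypothetical and are not taken from the Election Commission data set.

```python
from collections import Counter


def generalize_age(age, width=10):
    """Generalization: replace an exact age with a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def group_key(record, quasi_identifiers):
    """Tuple of quasi-identifier values that defines an equivalence class."""
    return tuple(record[q] for q in quasi_identifiers)


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class contains at least k records."""
    counts = Counter(group_key(r, quasi_identifiers) for r in records)
    return all(c >= k for c in counts.values())


def anonymize(records, quasi_identifiers, k):
    """Generalize ages, then suppress records whose class is smaller than k."""
    generalized = [dict(r, age=generalize_age(r["age"])) for r in records]
    counts = Counter(group_key(r, quasi_identifiers) for r in generalized)
    return [r for r in generalized
            if counts[group_key(r, quasi_identifiers)] >= k]


# Hypothetical toy records, invented for illustration only.
records = [
    {"age": 34, "district": "Bhopal", "party": "A"},
    {"age": 37, "district": "Bhopal", "party": "B"},
    {"age": 52, "district": "Indore", "party": "A"},
    {"age": 58, "district": "Indore", "party": "C"},
    {"age": 41, "district": "Gwalior", "party": "B"},
]

quasi_identifiers = ["age", "district"]
released = anonymize(records, quasi_identifiers, k=2)
print(is_k_anonymous(released, quasi_identifiers, k=2))
```

In this sketch the single Gwalior record is suppressed because no other record shares its generalized quasi-identifiers. That is the usual trade-off of the baseline approach: a larger k gives stronger privacy but loses more information.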
Keywords: Big data, Anonymization, Generalization, Suppression, l-Diversity, t-Closeness, Differential privacy
We are grateful to the Madhya Pradesh State Election Commission, India, for their ardent and constant support and for providing us with the real-time candidate big data needed for this research.