Privacy of Big Data: A Review

  • S. Sangeetha
  • G. Sudha Sadasivam


Big data has become a buzzword in recent years, owing to the voluminous amounts of structured, semi-structured, and unstructured data generated in the digital era. However, this data can be tracked and used for monetary benefit, which threatens individuals' privacy. Hence, considerable research has been devoted to privacy preservation. This book chapter emphasizes state-of-the-art privacy-preserving data mining (PPDM) mechanisms and reviews their application in big data environments.


Big data, Privacy, Hadoop, Spark, PPDM



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Information Technology, PSG College of Technology, Coimbatore, India
  2. Department of Computer Science and Engineering, PSG College of Technology, Coimbatore, India
