Privacy of Big Data: A Review

  • Chapter
Handbook of Big Data and IoT Security

Abstract

Big data has become a buzzword in recent years, owing to the voluminous amounts of structured, semi-structured, and unstructured data generated in the digital era. However, this data can be tracked and used for monetary benefit, which undermines individuals' privacy. Hence, considerable research has been devoted to privacy preservation. This chapter focuses on state-of-the-art privacy-preserving data mining mechanisms and reviews their application in big data environments.

Author information

Corresponding author

Correspondence to S. Sangeetha.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Sangeetha, S., Sudha Sadasivam, G. (2019). Privacy of Big Data: A Review. In: Dehghantanha, A., Choo, KK. (eds) Handbook of Big Data and IoT Security. Springer, Cham. https://doi.org/10.1007/978-3-030-10543-3_2

  • DOI: https://doi.org/10.1007/978-3-030-10543-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-10542-6

  • Online ISBN: 978-3-030-10543-3

  • eBook Packages: Computer Science (R0)
