
A Survey of Privacy-Preserving Methods Across Vertically Partitioned Data

Chapter in Privacy-Preserving Data Mining

Part of the book series: Advances in Database Systems (ADBS, volume 34)

The goal of data mining is to extract or “mine” knowledge from large amounts of data. However, data is often collected by several different sites. Privacy, legal, and commercial concerns restrict centralized access to this data, which can derail data mining projects. Recently, there has been a growing focus on finding solutions to this problem. Several algorithms have been proposed that perform distributed knowledge discovery while providing guarantees on the non-disclosure of data. Vertical partitioning is an important data distribution model often found in real life. Vertical partitioning, or heterogeneous distribution, means that different features of the same set of data are collected by different sites. In this chapter we survey some of the methods developed in the literature to mine vertically partitioned data without violating privacy, and we discuss the challenges and complexities specific to vertical partitioning.
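To make the data distribution model concrete, the following is a minimal Python sketch of vertical partitioning. It is an illustration only, not a method from the chapter: the site names (`hospital`, `insurer`), the attribute names, the toy records, and the helper functions `vertical_partition` and `join_on_key` are all hypothetical. Each site holds the shared entity key plus its own features, and the full records exist only if the sites' views are joined on that key, which is the centralization that privacy-preserving methods aim to avoid.

```python
from typing import Dict, List, Sequence

Row = Dict[str, object]


def vertical_partition(
    rows: Sequence[Row], key: str, site_attrs: Dict[str, List[str]]
) -> Dict[str, List[Row]]:
    """Give each site the shared key plus only its own attributes (hypothetical helper)."""
    views: Dict[str, List[Row]] = {}
    for site, attrs in site_attrs.items():
        views[site] = [{key: row[key], **{a: row[a] for a in attrs}} for row in rows]
    return views


def join_on_key(views: Dict[str, List[Row]], key: str) -> List[Row]:
    """Rebuild full records by joining all site views on the key.

    This is the centralized step that privacy concerns rule out in practice;
    it is shown here only to make the partitioning visible.
    """
    merged: Dict[object, Row] = {}
    for site_rows in views.values():
        for row in site_rows:
            merged.setdefault(row[key], {}).update(row)
    return [merged[k] for k in sorted(merged)]


# Toy data, invented for illustration: the same patients, different features per site.
ROWS: List[Row] = [
    {"patient_id": 1, "diagnosis": "flu", "claim_total": 120.0},
    {"patient_id": 2, "diagnosis": "asthma", "claim_total": 310.5},
]
SITE_ATTRS = {"hospital": ["diagnosis"], "insurer": ["claim_total"]}
VIEWS = vertical_partition(ROWS, "patient_id", SITE_ATTRS)

if __name__ == "__main__":
    for site, site_rows in VIEWS.items():
        print(site, site_rows)
```

In this sketch every site sees every entity but only some of its attributes, in contrast to horizontal partitioning, where each site would hold complete records for a subset of the entities.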




Copyright information

© 2008 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Vaidya, J. (2008). A Survey of Privacy-Preserving Methods Across Vertically Partitioned Data. In: Aggarwal, C.C., Yu, P.S. (eds) Privacy-Preserving Data Mining. Advances in Database Systems, vol 34. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-70992-5_14

  • DOI: https://doi.org/10.1007/978-0-387-70992-5_14

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-70991-8

  • Online ISBN: 978-0-387-70992-5

  • eBook Packages: Computer Science, Computer Science (R0)
