
A Survey of Privacy-Preserving Methods Across Horizontally Partitioned Data


Part of the book series: Advances in Database Systems (ADBS, volume 34)

Data mining can extract important knowledge from large data collections, but sometimes these collections are split among various parties. Data warehousing, which brings data from multiple sources under a single authority, increases the risk of privacy violations. Furthermore, privacy concerns may prevent the parties from directly sharing even some metadata.

Distributed data mining and processing provide a means to address this issue, particularly if queries are processed in a way that avoids the disclosure of any information beyond the final result. This chapter describes methods to mine horizontally partitioned data without violating privacy and discusses how to use the data mining results in a privacy-preserving way. The methods described here incorporate cryptographic techniques to minimize the information shared, while adding as little overhead as possible to the mining and processing task.
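To make the idea of revealing only the final result concrete, here is a minimal Python sketch of a ring-based secure sum, a building block often used in this setting. It is an illustration under stated assumptions, not necessarily a protocol from the chapter: the parties are assumed to be semi-honest and non-colluding, the true total is assumed to be smaller than the public modulus, and the names `secure_sum`, `local_counts`, and the modulus `2**32` are hypothetical. The whole ring is simulated in one process. In a real deployment each step would run at a different site.

```python
import secrets


def secure_sum(local_values, modulus):
    """Simulate a ring-based secure sum (illustrative sketch only).

    Assumptions: parties are semi-honest and do not collude, every local
    value is a non-negative integer, and the true total is below `modulus`.
    Returns the total and the list of messages passed around the ring.
    """
    if not local_values:
        raise ValueError("need at least one party")

    # Party 0 (the initiator) draws a uniformly random mask known only to it.
    mask = secrets.randbelow(modulus)

    # The initiator forwards its own value hidden behind the mask.
    running = (mask + local_values[0]) % modulus
    messages = [running]

    # Each remaining party adds its private value to the masked running total.
    # What it receives is uniformly distributed, so it learns nothing from it.
    for value in local_values[1:]:
        running = (running + value) % modulus
        messages.append(running)

    # The last message returns to the initiator, which removes the mask.
    total = (running - mask) % modulus
    return total, messages


# Hypothetical example: three sites each hold different rows of the same
# table and know only their own count of rows supporting some itemset.
local_counts = [120, 45, 310]
total, messages = secure_sum(local_counts, modulus=2**32)
print(total)
```

Because the data are horizontally partitioned, a global count is simply the sum of the local counts, which is why a sum primitive is a natural fit. The sketch offers no protection if two neighbours of a party collude, since they can subtract the message that party received from the one it sent.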




© 2008 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Kantarcioglu, M. (2008). A Survey of Privacy-Preserving Methods Across Horizontally Partitioned Data. In: Aggarwal, C.C., Yu, P.S. (eds) Privacy-Preserving Data Mining. Advances in Database Systems, vol 34. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-70992-5_13


  • DOI: https://doi.org/10.1007/978-0-387-70992-5_13

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-70991-8

  • Online ISBN: 978-0-387-70992-5

  • eBook Packages: Computer Science, Computer Science (R0)
