A Survey of Privacy-Preserving Methods Across Horizontally Partitioned Data

  • Murat Kantarcioglu
Part of the Advances in Database Systems book series (ADBS, volume 34)

Data mining can extract important knowledge from large data collections, but these collections are sometimes split among several parties. Data warehousing, which brings data from multiple sources under a single authority, increases the risk of privacy violations. Furthermore, privacy concerns may prevent the parties from directly sharing even some metadata.

Distributed data mining and processing provide a means to address this issue, particularly if queries are processed in a way that discloses no information beyond the final result. This chapter describes methods to mine horizontally partitioned data without violating privacy and discusses how to use the data mining results in a privacy-preserving way. The methods described here incorporate cryptographic techniques to minimize the information shared, while adding as little overhead as possible to the mining and processing task.
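To make the idea of revealing nothing beyond the final result concrete, the following is a minimal sketch of a secure sum over horizontally partitioned data using additive secret sharing. It is an illustration only, not the chapter's own protocol: the function names, the modulus, and the example counts are all assumptions, and the sketch assumes semi-honest parties that follow the protocol and do not collude.

```python
import secrets

# Assumed public modulus; it must exceed any total the parties could produce.
MODULUS = 2**61 - 1


def make_shares(value, n_parties, modulus=MODULUS):
    """Split value into n_parties random shares that sum to value mod modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares add up to the value.
    shares.append((value - sum(shares)) % modulus)
    return shares


def secure_sum(local_counts, modulus=MODULUS):
    """Simulate a secure sum: only the total is reconstructed, not the inputs."""
    n = len(local_counts)
    # distributed[i][j] is the share that party i sends to party j.
    distributed = [make_shares(count, n, modulus) for count in local_counts]
    # Each party j adds the shares it received and publishes only that sum.
    partial = [
        sum(distributed[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # Combining the published partial sums yields the global total.
    return sum(partial) % modulus


# Hypothetical local support counts held by three parties.
local_counts = [120, 45, 310]
total = secure_sum(local_counts)
```

Each individual share is uniformly random, so a single party learns nothing about another party's count from the shares it receives; only the combined total is meaningful. This single-process simulation omits the network communication and the stronger adversary models that a real deployment would need.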


Keywords: privacy, distributed data mining, horizontally partitioned data, homomorphic encryption





Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Murat Kantarcioglu (1)
  1. Computer Science Department, University of Texas at Dallas, USA
