Skip to main content

Privacy Preserving BIRCH Algorithm for Clustering over Vertically Partitioned Databases

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 4165))

Abstract

BIRCH algorithm, introduced by Zhang et al. [15], is a well known algorithm for effectively finding clusters in a large data set. The two major components of the BIRCH algorithm are CF tree construction and global clustering. However BIRCH algorithm is basically designed as an algorithm working on a single database. We propose the first novel method for running BIRCH over a vertically partitioned data sets, distributed in two different databases in a privacy preserving manner. We first provide efficient solutions to crypto primitives such as finding minimum index in a vector sum and checking if sum of two private values exceed certain threshold limit. We then use these primitives as basic tools to arrive at secure solutions to CF tree construction and single link clustering for implementing BIRCH algorithm.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Agrawal, D., Aggarwal, C.C.: On the Design and Quantification of Privacy Preserving Data Mining Algorithms. In: Proceedings of the Twentieth ACM SIGACT - SIGMOD - SIGART Symposium on Principles of Database Systems, May 21-23, 2001, pp. 247–255. ACM, Santa Barbara (2001)

    Chapter  Google Scholar 

  2. Agrawal, R., Srikant, R.: Privacy preserving data mining. In: Proceedings of the 2000 ACM SIGMOD Conference on Management of Data, Dallas, TX, May 14-19, 2000. ACM Press, New York (2000)

    Google Scholar 

  3. Goethals, B., Laur, S., Lipmaa, H., Mielikainen, T.: On private scalar product computation for privacy-preserving data mining. In: Park, C.-s., Chee, S. (eds.) ICISC 2004. LNCS, vol. 3506, pp. 104–120. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  4. Cachin, C.: Efficient private bidding and auctions with an oblivious third party. In: Proceedings of 6th ACM Computer and communications security, SIGSAC, pp. 120–127. ACM Press, New York (1999)

    Chapter  Google Scholar 

  5. Damgard, I., Jurik, M.: A Generalisation, a Simplification and Some Applications of Paillier’s Probabilistic Public-Key System. In: Kim, K.-c. (ed.) PKC 2001. LNCS, vol. 1992, pp. 119–136. Springer, Heidelberg (2001)

    Chapter  Google Scholar 

  6. Jagannathan, G., Pillaipakkamnatt, K., Wright, R.N.: A New Privacy-Preserving Distributed k-Clustering Algorithm. In: Proceedings of the 2006 SIAM International Conference on Data Mining (SDM) (2006)

    Google Scholar 

  7. Jagannathan, G., Wright, R.N.: Privacy-preserving distributed k-means clustering over arbitrarily partitioned data. In: Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2005, Chicago, Illinois, USA, August 21-24, 2005. ACM, New York (2005)

    Google Scholar 

  8. Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data, ch. 3. Prentice-Hall Inc., Englewood Cliffs (1988)

    MATH  Google Scholar 

  9. Lindell, Y., Pinkas, B.: Privacy preserving data mining. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880, pp. 36–54. Springer, Heidelberg (2000)

    Chapter  Google Scholar 

  10. Natan, R.B.: Implementing Database Security and Auditing, ch. 11. Elsevier, Amsterdam (2005)

    Google Scholar 

  11. Oliveira, S., Zaiane, O.R.: Privacy preserving clustering by data transformation. In: Proceedings of the 18th Brazilian Symposium on Databases, pp. 304–318 (2003)

    Google Scholar 

  12. Paillier, P.: Public-key Cryptosystems Based on Composite Degree Residuosity Classes. In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 223–238. Springer, Heidelberg (1999)

    Google Scholar 

  13. Rivest, R., Adleman, L., Dertouzos, M.: On data banks and privacy homomorphisms. In: Foundations of Secure Computation, pp. 169–178. Academic Press, London (1978)

    Google Scholar 

  14. Jha, S., Kruger, L., McDaniel, P.: Privacy Preserving Clustering. In: di Vimercati, S.d.C., Syverson, P.F., Gollmann, D. (eds.) ESORICS 2005. LNCS, vol. 3679, pp. 397–417. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  15. Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: An efficient Data Clustering Method of Very Large Databases. In: Proceedings of the ACM SIGMOD Conference on Management of Data, Montreal, Canada, pp. 103–114 (June 1996)

    Google Scholar 

  16. Vaidya, J., Clifton, C.: Privacy preserving association rule mining in vertically partitioned data. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, Alberta, Canada, July 23-26, 2002, pp. 639–644. ACM, New York (2002)

    Chapter  Google Scholar 

  17. Vaidya, J., Clifton, C.: Privacy-preserving k-means clustering over vertically partitioned data. In: Proceedings of the 9th ACM SIGKDD International Conference on knowledge Discovery and Data Mining, Washington, DC, USA, August 24-27, 2003. ACM, New York (2003)

    Google Scholar 

  18. Yao, A.C.: Protocols for secure computation. In: Proceedings of 23rd IEEE Symposium on Foundations of Computer Science, pp. 160–164. IEEE Computer Society Press, Los Alamitos (1982)

    Google Scholar 

  19. Yao, A.C.: How to generate and exchange secrets. In: Proceedings of the 27th IEEE Symp. on Foundations of Computer Science, Toronto, Ontario, Canada, October 27 - 29, 1986, pp. 162–167 (1986)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Prasad, P.K., Rangan, C.P. (2006). Privacy Preserving BIRCH Algorithm for Clustering over Vertically Partitioned Databases. In: Jonker, W., Petković, M. (eds) Secure Data Management. SDM 2006. Lecture Notes in Computer Science, vol 4165. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11844662_7

Download citation

  • DOI: https://doi.org/10.1007/11844662_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-38984-2

  • Online ISBN: 978-3-540-38987-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics