Privacy Preserving BIRCH Algorithm for Clustering over Vertically Partitioned Databases

Prasad, P. Krishna; Rangan, C. Pandu

doi:10.1007/11844662_7

P. Krishna Prasad & C. Pandu Rangan

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4165)

Abstract

The BIRCH algorithm, introduced by Zhang et al. [15], is a well-known algorithm for effectively finding clusters in a large data set. Its two major components are CF tree construction and global clustering. However, BIRCH was designed to work on a single database. We propose the first method for running BIRCH in a privacy-preserving manner over vertically partitioned data sets distributed across two databases. We first provide efficient solutions to cryptographic primitives such as finding the index of the minimum entry in a vector sum and checking whether the sum of two private values exceeds a given threshold. We then use these primitives as building blocks to arrive at secure solutions to CF tree construction and single-link clustering, which together implement the BIRCH algorithm.
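To make the two primitives named in the abstract concrete, the sketch below computes only their input/output behaviour, in the clear and with no cryptography. It is not the protocol from the paper, which is not reproduced on this page. The function names, the example values, and the choice of per-party partial squared distances as the vectors being summed are assumptions made for illustration. The last choice relies on the fact that a squared Euclidean distance over vertically partitioned attributes is the sum of the squared distances over each party's attributes.

```python
from typing import List, Sequence


def partial_sq_dists(point: Sequence[float],
                     centroids: Sequence[Sequence[float]]) -> List[float]:
    """Squared distance from one record to each centroid,
    restricted to the attributes held by a single party."""
    return [sum((p - c) ** 2 for p, c in zip(point, cen)) for cen in centroids]


def min_index_of_vector_sum(a: Sequence[float], b: Sequence[float]) -> int:
    """Functionality only (NOT secure): index of the smallest entry of a + b.
    A secure protocol would output this index without revealing a or b."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    sums = [x + y for x, y in zip(a, b)]
    return min(range(len(sums)), key=sums.__getitem__)


def sum_exceeds_threshold(x: float, y: float, threshold: float) -> bool:
    """Functionality only (NOT secure): does x + y exceed the threshold?"""
    return x + y > threshold


# Hypothetical data: party A holds two attributes, party B holds one.
record_a, record_b = (1.0, 2.0), (3.0,)
centroids_a = [(0.0, 0.0), (1.0, 2.0), (5.0, 5.0)]
centroids_b = [(0.0,), (4.0,), (3.0,)]

dists_a = partial_sq_dists(record_a, centroids_a)   # party A's private vector
dists_b = partial_sq_dists(record_b, centroids_b)   # party B's private vector

# Closest candidate entry is the argmin of the summed vector.
closest = min_index_of_vector_sum(dists_a, dists_b)

# Assumed use: compare the combined squared distance with a squared threshold.
too_far = sum_exceeds_threshold(dists_a[closest], dists_b[closest], 4.0)
```

In this toy run each party contributes one private vector, and only the index `closest` and the Boolean `too_far` would need to be revealed. That is the kind of output a privacy-preserving version of these primitives is meant to produce.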

References

Agrawal, D., Aggarwal, C.C.: On the Design and Quantification of Privacy Preserving Data Mining Algorithms. In: Proceedings of the Twentieth ACM SIGACT - SIGMOD - SIGART Symposium on Principles of Database Systems, May 21-23, 2001, pp. 247–255. ACM, Santa Barbara (2001)
Agrawal, R., Srikant, R.: Privacy preserving data mining. In: Proceedings of the 2000 ACM SIGMOD Conference on Management of Data, Dallas, TX, May 14-19, 2000. ACM Press, New York (2000)
Goethals, B., Laur, S., Lipmaa, H., Mielikainen, T.: On private scalar product computation for privacy-preserving data mining. In: Park, C.-s., Chee, S. (eds.) ICISC 2004. LNCS, vol. 3506, pp. 104–120. Springer, Heidelberg (2005)
Cachin, C.: Efficient private bidding and auctions with an oblivious third party. In: Proceedings of the 6th ACM Conference on Computer and Communications Security, SIGSAC, pp. 120–127. ACM Press, New York (1999)
Damgard, I., Jurik, M.: A Generalisation, a Simplification and Some Applications of Paillier’s Probabilistic Public-Key System. In: Kim, K.-c. (ed.) PKC 2001. LNCS, vol. 1992, pp. 119–136. Springer, Heidelberg (2001)
Jagannathan, G., Pillaipakkamnatt, K., Wright, R.N.: A New Privacy-Preserving Distributed k-Clustering Algorithm. In: Proceedings of the 2006 SIAM International Conference on Data Mining (SDM) (2006)
Jagannathan, G., Wright, R.N.: Privacy-preserving distributed k-means clustering over arbitrarily partitioned data. In: Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2005, Chicago, Illinois, USA, August 21-24, 2005. ACM, New York (2005)
Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data, ch. 3. Prentice-Hall Inc., Englewood Cliffs (1988)
Lindell, Y., Pinkas, B.: Privacy preserving data mining. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880, pp. 36–54. Springer, Heidelberg (2000)
Natan, R.B.: Implementing Database Security and Auditing, ch. 11. Elsevier, Amsterdam (2005)
Oliveira, S., Zaiane, O.R.: Privacy preserving clustering by data transformation. In: Proceedings of the 18th Brazilian Symposium on Databases, pp. 304–318 (2003)
Paillier, P.: Public-key Cryptosystems Based on Composite Degree Residuosity Classes. In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 223–238. Springer, Heidelberg (1999)
Rivest, R., Adleman, L., Dertouzos, M.: On data banks and privacy homomorphisms. In: Foundations of Secure Computation, pp. 169–178. Academic Press, London (1978)
Jha, S., Kruger, L., McDaniel, P.: Privacy Preserving Clustering. In: di Vimercati, S.d.C., Syverson, P.F., Gollmann, D. (eds.) ESORICS 2005. LNCS, vol. 3679, pp. 397–417. Springer, Heidelberg (2005)
Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: An Efficient Data Clustering Method for Very Large Databases. In: Proceedings of the ACM SIGMOD Conference on Management of Data, Montreal, Canada, pp. 103–114 (June 1996)
Vaidya, J., Clifton, C.: Privacy preserving association rule mining in vertically partitioned data. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, Alberta, Canada, July 23-26, 2002, pp. 639–644. ACM, New York (2002)
Vaidya, J., Clifton, C.: Privacy-preserving k-means clustering over vertically partitioned data. In: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 24-27, 2003. ACM, New York (2003)
Yao, A.C.: Protocols for secure computation. In: Proceedings of 23rd IEEE Symposium on Foundations of Computer Science, pp. 160–164. IEEE Computer Society Press, Los Alamitos (1982)
Yao, A.C.: How to generate and exchange secrets. In: Proceedings of the 27th IEEE Symp. on Foundations of Computer Science, Toronto, Ontario, Canada, October 27 - 29, 1986, pp. 162–167 (1986)

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology, Madras, Chennai, 600036, India
P. Krishna Prasad & C. Pandu Rangan

Editor information

Editors and Affiliations

Philips Research, The Netherlands
Willem Jonker
Philips Research, Information & System Security, High Tech Campus 37 (WY 71), 5656 AE Eindhoven, The Netherlands
Milan Petković

About this paper

Cite this paper

Prasad, P.K., Rangan, C.P. (2006). Privacy Preserving BIRCH Algorithm for Clustering over Vertically Partitioned Databases. In: Jonker, W., Petković, M. (eds) Secure Data Management. SDM 2006. Lecture Notes in Computer Science, vol 4165. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11844662_7

DOI: https://doi.org/10.1007/11844662_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38984-2
Online ISBN: 978-3-540-38987-3
eBook Packages: Computer Science (R0)