Privacy-Awareness of Distributed Data Clustering Algorithms Revisited

  • Josenildo C. da Silva
  • Matthias Klusch
  • Stefano Lodi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9897)


Several privacy measures have been proposed in the privacy-preserving data mining literature. However, existing measures either assume a centralized data source or assume that no insider will attempt to infer information. This paper presents distributed privacy measures that take into account collusion attacks and point-level breaches in distributed data clustering. An analysis of representative distributed data clustering algorithms shows that collusion is an important source of privacy issues and that the analyzed algorithms exhibit different vulnerabilities to collusion groups.


Keywords (added by machine, not by the authors): Inference Attack, Collusion Attack, Malicious Peer, Average Privacy, Secure Multiparty Computation



This work was partly supported by the EU-funded project TOREADOR (contract no. H2020-688797).



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Josenildo C. da Silva (1), email author
  • Matthias Klusch (2)
  • Stefano Lodi (3)
  1. Depto. de Informática, Instituto Federal do Maranhão (IFMA), São Luís, Brazil
  2. DFKI GmbH, Saarbrücken, Germany
  3. Dipartimento di Informatica - Scienza e Ingegneria, Bologna, Italy
