Privacy-Preserving Subgraph Discovery

  • Danish Mehmood
  • Basit Shafiq
  • Jaideep Vaidya
  • Yuan Hong
  • Nabil Adam
  • Vijayalakshmi Atluri
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7371)


Graph-structured data arises in many domains and applications, and analyzing it can yield valuable insights. Frequent subgraph discovery, the problem of finding the set of subgraphs that occur frequently in an underlying database of graphs, has attracted considerable recent attention, and many algorithms have been proposed to solve it. However, all of them assume that the entire set of graphs is centralized at a single site, which often does not hold in practice. Furthermore, in many interesting applications (for example, drug discovery and clique detection) the data is sensitive. In this paper, we address the problem of privacy-preserving subgraph discovery. We propose a flexible approach that can use any underlying frequent subgraph discovery algorithm and relies on cryptographic primitives to preserve privacy. A comprehensive experimental evaluation validates the feasibility of our approach.
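The abstract does not spell out the protocol, so the following is only an illustrative sketch of the general idea rather than the authors' method. It assumes that each site holds a local support count for one candidate subgraph, and that an additively homomorphic scheme (here a toy Paillier cryptosystem with tiny, insecure primes) is used to combine those counts before comparing the total against a global threshold. All function names and parameters are hypothetical.

```python
import math
import secrets

def keygen(p=1009, q=1013):
    # Toy primes for illustration only; real deployments need far larger primes.
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                      # common simplifying choice of generator
    mu = pow(lam, -1, n)           # inverse of lam mod n (valid for g = n + 1)
    return (n, g), (lam, mu)

def encrypt(pub, m):
    n, g = pub
    n2 = n * n
    while True:                    # pick a random r that is invertible mod n
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    # Multiplying ciphertexts adds the underlying plaintexts.
    n, _ = pub
    return (c1 * c2) % (n * n)

def is_globally_frequent(local_counts, global_threshold):
    # Assumption: in a real setting the key holder and the aggregator are
    # different parties; both roles are collapsed here to keep the sketch short.
    pub, priv = keygen()
    total = encrypt(pub, 0)
    for count in local_counts:     # each site would encrypt its own count
        total = add_encrypted(pub, total, encrypt(pub, count))
    return decrypt(pub, priv, total) >= global_threshold
```

For example, `is_globally_frequent([3, 5, 2], 10)` combines three local counts without any single count being decrypted on its own. The sketch still decrypts the exact total, which a stricter protocol might avoid by using a secure comparison instead.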


Local Candidate, Support Threshold, Homomorphic Encryption, Global Threshold, Local Threshold
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© IFIP International Federation for Information Processing 2012

Authors and Affiliations

  • Danish Mehmood (1)
  • Basit Shafiq (1, 2)
  • Jaideep Vaidya (2)
  • Yuan Hong (2)
  • Nabil Adam (2)
  • Vijayalakshmi Atluri (2)
  1. Lahore University of Management Sciences, Pakistan
  2. CIMIC, Rutgers University, USA
