Generalization Based Privacy-Preserving Provenance Publishing

  • Jian Wu
  • Weiwei Ni
  • Sen Zhang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11242)


With the growth of data sharing, the demand for publishing data provenance has become increasingly urgent. Data provenance describes how data is generated and how it evolves over time. It has many applications, including evaluation of data quality, audit trails, replication recipes, and data citation. Some in-out mapping relations and related intermediate parameters in data provenance may be private, and how to protect privacy when publishing data provenance has attracted increasing attention from researchers in recent years. Existing solutions rely primarily on the Γ-privacy model, which hides certain properties to solve the module privacy-preserving problem. However, the Γ-privacy model has the following disadvantages: (1) the attribute domains are limited; (2) it is difficult to set a consistent Γ value for the workflow; (3) the attribute selection strategy is unreasonable. To address these problems, a novel privacy-preserving provenance model is devised to balance the tradeoff between privacy preservation and the utility of data provenance. The devised model applies generalization and introduces the notion of a generalized level. Furthermore, an effective privacy-preserving provenance publishing method based on generalization is proposed to achieve privacy security in data provenance publishing. Finally, theoretical analysis and experimental results demonstrate the effectiveness of our solution.
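The abstract does not spell out the generalization procedure, so the following is only a rough illustration of the general idea, not the authors' algorithm. It assumes integer-valued inputs, a hypothetical interval hierarchy in which each generalized level widens the interval by a fixed factor, and a made-up in-out relation for a private module; the names `generalize`, `generalize_relation`, and `base_width` are invented for this sketch.

```python
from collections import defaultdict


def generalize(value, level, base_width=2):
    """Replace an integer by the closed interval (lo, hi) that contains it.

    Level 0 keeps the exact value. Each higher level multiplies the interval
    width by base_width. This hierarchy is an assumption of the sketch and is
    not taken from the paper.
    """
    if level == 0:
        return (value, value)
    width = base_width ** level
    lo = (value // width) * width
    return (lo, lo + width - 1)


def generalize_relation(relation, level):
    """Generalize the input column of an in-out relation.

    Returns a dict mapping each generalized input interval to the set of
    outputs observed for any exact input inside that interval.
    """
    grouped = defaultdict(set)
    for x, y in relation:
        grouped[generalize(x, level)].add(y)
    return dict(grouped)


# Made-up in-out relation of a private module: y = x * x for x in 0..7.
relation = [(x, x * x) for x in range(8)]

# Publish the relation at generalized level 2 (intervals of width 4).
published = generalize_relation(relation, level=2)
for interval, outputs in sorted(published.items()):
    print(interval, sorted(outputs))
```

In this toy setting each published interval is associated with several candidate outputs, so an observer can no longer recover the exact in-out pairs. Raising the level widens the intervals, which hides more but also makes the published provenance less precise; this is the privacy and utility tradeoff the abstract refers to.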


Data provenance, Privacy-preserving, Generalization, Generalized level



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Department of Computer Science and Engineering, Southeast University, Nanjing, China
  2. Key Laboratory of Computer Network and Information Integration in Southeast University, Ministry of Education, Nanjing, China
