Databases Traceability by Means of Watermarking with Optimized Detection
In this paper, we propose a robust lossless database watermarking scheme whose detection is optimized for the traceability of databases merged into, for example, shared data warehouses. Our aim is to identify a database that has been merged with several other watermarked databases. The scheme modulates the center of mass of an attribute's circular histogram, and we theoretically prove that the impact of the database mixture on the embedded identifier is equivalent to the addition of Gaussian noise, the parameters of which can be estimated. From these theoretical results, we derive an optimized watermark detector that offers higher discriminative performance than the classic correlation-based detector. Depending on the modulated attribute, it can detect a database representing at least \(4\%\) of the database mixture with a detection rate close to \(100\%\). These results have been experimentally verified on a set of medical databases containing inpatient hospital stay records.
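To make the central quantity concrete, the following minimal Python sketch computes the center of mass of a circular histogram for an integer attribute. It assumes the attribute takes values in 0..domain_size-1 and that each value is mapped to an equally spaced angle on the unit circle. The function name and this mapping are illustrative assumptions, not the scheme's actual embedding or detection procedure.

```python
import math
from collections import Counter


def circular_center_of_mass(values, domain_size):
    """Return (angle, length) of the mean vector of a circular histogram.

    Assumption: each integer value v in [0, domain_size) is placed at angle
    2*pi*v/domain_size on the unit circle (hypothetical mapping).
    """
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)  # histogram: value -> number of occurrences
    n = len(values)
    # Weighted average of the unit vectors associated with each histogram bin.
    x = sum(c * math.cos(2 * math.pi * v / domain_size) for v, c in counts.items()) / n
    y = sum(c * math.sin(2 * math.pi * v / domain_size) for v, c in counts.items()) / n
    # Angle gives the direction of the center of mass, length its distance
    # from the circle's center (1.0 means all mass sits in a single bin).
    return math.atan2(y, x), math.hypot(x, y)


# Two records at values 0 and 6 on a 24-value domain (angles 0 and pi/2).
angle, length = circular_center_of_mass([0, 6], 24)
print(angle, length)
```

A watermark based on this quantity would shift the angle slightly in one direction or the other to encode a symbol; how that shift is performed and detected is not described in the abstract and is left out of the sketch.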
Keywords: Traceability, Watermarking, Relational database, Information security
The authors are very grateful to the Department of Medical Information and Archives, CHU Lille; UDSL EA 2694; Univ. Lille Nord de France; F-59000 Lille, France, for the experimental data used in this study.