Constraint-Driven Database Repair
Data reconciliation; Data standardization; Minimal-change integrity maintenance
Given a set Σ of integrity constraints and a database instance D of a schema R, the problem of constraint-driven database repair is to find an instance D′ of the same schema R such that (i) D′ is consistent, i.e., D′ satisfies Σ, and (ii) D′ minimally differs from the original database D, i.e., obtaining D′ from D takes a minimal number of repair operations or incurs minimal cost.
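The definition can be illustrated with a minimal sketch, under assumptions that are not part of the entry itself: Σ holds a single functional dependency (zip → city), the only repair operation is tuple deletion, and "minimal" means the fewest deleted tuples. The helper names `violates_fd` and `minimal_deletion_repair` and the sample instance are hypothetical, and the brute-force search is exponential, so it is meant only for toy instances.

```python
from itertools import combinations


def violates_fd(rows, lhs, rhs):
    """Return True if two rows agree on the lhs attributes but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs key
        if seen.setdefault(key, val) != val:
            return True
    return False


def minimal_deletion_repair(rows, lhs, rhs):
    """Try deleting 0, 1, 2, ... tuples until the instance satisfies the FD.

    Returns (repaired_rows, number_of_deletions). Because candidate sets are
    enumerated in order of increasing size, the first consistent instance
    found uses a minimal number of deletions.
    """
    n = len(rows)
    for k in range(n + 1):
        for removed in combinations(range(n), k):
            kept = [r for i, r in enumerate(rows) if i not in removed]
            if not violates_fd(kept, lhs, rhs):
                return kept, k
    return [], n  # not reached: the empty instance is always consistent


# Hypothetical instance D: the third tuple conflicts with the first two on zip -> city.
D = [
    {"name": "Ann", "zip": "07974", "city": "Murray Hill"},
    {"name": "Bob", "zip": "07974", "city": "Murray Hill"},
    {"name": "Cid", "zip": "07974", "city": "New York"},
    {"name": "Dee", "zip": "10001", "city": "New York"},
]

repaired, cost = minimal_deletion_repair(D, ["zip"], ["city"])
print(cost, [r["name"] for r in repaired])
```

Other repair models fit the same definition but change what counts as an operation, for example modifying attribute values instead of deleting tuples and weighting each change by a cost; this sketch does not cover them.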
Real-life data is often dirty, i.e., inconsistent, inaccurate, stale, or deliberately falsified. While the prevalent use of the Web has made it possible, on an unprecedented scale, to extract and integrate data from diverse sources, it has also increased the risks of creating and propagating dirty data. Dirty data routinely leads to misleading or biased analytical results and decisions and incurs loss of revenue, credibility, and customers. With this comes the need for...