Cardinality Constraints with Probabilistic Intervals
Probabilistic databases are well suited to the requirements of modern applications that produce large volumes of uncertain data from a variety of sources. We propose an expressive class of probabilistic cardinality constraints that lets users specify lower and upper bounds on the marginal probabilities with which cardinality constraints should hold in a data set of acceptable quality. The bounds help organizations balance the consistency and completeness targets for their data quality, and provide probabilities on the number of query answers without querying the data. We establish algorithms for an agile, schema-driven acquisition of suitable lower and upper bounds in a given application domain, and for reasoning about the constraints.
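To make the idea of a cardinality constraint with a probabilistic interval concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a probabilistic database is given explicitly as a finite list of possible worlds, each with a probability, and that a cardinality constraint card(X) ≤ b holds in a world when no combination of values on the attributes X occurs in more than b tuples. The function names, the attribute names (`zone`, `time`, `tag`), and the sample data are all hypothetical.

```python
from collections import Counter


def satisfies_cc(world, attrs, bound):
    """True if no value combination on attrs occurs in more than `bound` tuples."""
    counts = Counter(tuple(t[a] for a in attrs) for t in world)
    return all(c <= bound for c in counts.values())


def marginal_probability(worlds, attrs, bound):
    """Sum the probabilities of the possible worlds that satisfy card(attrs) <= bound."""
    return sum(p for world, p in worlds if satisfies_cc(world, attrs, bound))


def holds_with_interval(worlds, attrs, bound, lower, upper, eps=1e-9):
    """Check that the marginal probability lies in [lower, upper] (up to eps)."""
    m = marginal_probability(worlds, attrs, bound)
    return lower - eps <= m <= upper + eps


# Hypothetical sample data: three nested possible worlds of sensor readings.
t1 = {"zone": "Z1", "time": "09", "tag": "a"}
t2 = {"zone": "Z1", "time": "09", "tag": "b"}
t3 = {"zone": "Z1", "time": "09", "tag": "c"}
t4 = {"zone": "Z1", "time": "09", "tag": "d"}

worlds = [
    ([t1, t2], 0.5),
    ([t1, t2, t3], 0.3),
    ([t1, t2, t3, t4], 0.2),
]

# card(zone, time) <= 3 holds in the first two worlds only.
print(marginal_probability(worlds, ["zone", "time"], 3))
print(holds_with_interval(worlds, ["zone", "time"], 3, lower=0.75, upper=0.9))
```

In this toy setting, the marginal probability of card(zone, time) ≤ 3 also bounds from below the probability that a query for all readings in a given zone and time returns at most three answers, which is the sense in which such bounds speak about answer counts without running the query. Enumerating worlds explicitly is only feasible for small examples.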
Keywords: Cardinality constraint, Data and knowledge intelligence, Decision support, Probability, Requirements engineering, Summaries