Explaining Query Answer Completeness and Correctness with Partition Patterns

  • Conference paper
Database and Expert Systems Applications (DEXA 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11707)

Abstract

Information incompleteness is a major data quality issue, amplified by the increasing amount of data collected from unreliable sources. Assessing the completeness of data is crucial for determining the quality of the data itself, but also for verifying the validity of query answers over incomplete data. In this article, we tackle the issue of efficiently describing and inferring knowledge about data completeness with respect to a complete reference data set, and we study the use of a partition pattern algebra for summarizing the completeness and validity of query answers. We describe an implementation and experiments with a real-world dataset to validate the effectiveness and the efficiency of our approach.

Notes

  1. Two pattern tables are equivalent if their instances in R are equal.
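The equivalence stated in this note can be illustrated with a small sketch. The paper's formal definitions are not part of this excerpt, so the following rests on assumptions: a pattern is taken to be a tuple whose fields are either a constant or a wildcard `*`, the instance of a pattern table in R is taken to be the set of tuples of R matched by at least one of its patterns, and the names `matches`, `instance`, and `equivalent` are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Row = Tuple[str, ...]
WILDCARD = "*"  # assumed wildcard symbol; not confirmed by this excerpt


def matches(pattern: Row, row: Row) -> bool:
    """A row matches a pattern if every non-wildcard field is equal."""
    return len(pattern) == len(row) and all(
        p == WILDCARD or p == v for p, v in zip(pattern, row)
    )


def instance(pattern_table: Iterable[Row], reference: Iterable[Row]) -> Set[Row]:
    """Rows of the reference table R matched by at least one pattern (assumed semantics)."""
    patterns: List[Row] = list(pattern_table)
    return {row for row in reference if any(matches(p, row) for p in patterns)}


def equivalent(p1: Iterable[Row], p2: Iterable[Row], reference: Iterable[Row]) -> bool:
    """Two pattern tables are equivalent if their instances in R are equal."""
    ref = list(reference)
    return instance(p1, ref) == instance(p2, ref)


# Hypothetical reference table R with two attributes (floor, room).
R = [("f1", "r1"), ("f1", "r2"), ("f2", "r1")]

compact = [("f1", WILDCARD)]             # one pattern with a wildcard
explicit = [("f1", "r1"), ("f1", "r2")]  # the same rows, listed one by one
other = [(WILDCARD, "r1")]               # covers a different set of rows

same = equivalent(compact, explicit, R)
different = equivalent(compact, other, R)
```

Under these assumptions, `compact` and `explicit` are equivalent because both cover exactly the two `f1` rows of R, while `other` is not equivalent to either because it also covers `("f2", "r1")` and misses `("f1", "r2")`. Note that equivalence is relative to R: the same two pattern tables may stop being equivalent over a different reference table.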

Author information

Corresponding author

Correspondence to Bernd Amann.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Hannou, F.Z., Amann, B., Baazizi, M.A. (2019). Explaining Query Answer Completeness and Correctness with Partition Patterns. In: Hartmann, S., Küng, J., Chakravarthy, S., Anderst-Kotsis, G., Tjoa, A., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2019. Lecture Notes in Computer Science, vol 11707. Springer, Cham. https://doi.org/10.1007/978-3-030-27618-8_4

  • DOI: https://doi.org/10.1007/978-3-030-27618-8_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-27617-1

  • Online ISBN: 978-3-030-27618-8

  • eBook Packages: Computer Science, Computer Science (R0)
