Detecting the Temporal Context of Queries

  • Oliver Kennedy
  • Ying Yang
  • Jan Chomicki
  • Ronny Fehling
  • Zhen Hua Liu
  • Dieter Gawlick
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 206)

Abstract

Business intelligence and reporting tools rely on a database that accurately mirrors the state of the world. Yet, even if the schema and queries are constructed in exacting detail, assumptions about the data made during extraction, transformation, and schema and query creation of the reporting database may be (accidentally) ignored by end users, or may change as the database evolves over time. As these assumptions are typically implicit (e.g., assuming that a sales record relation is append-only), it can be hard to even detect that a mistaken assumption has been made. In this paper, we argue that such errors are consequences of unintended contextual dependence, i.e., query outputs dependent on a variable characteristic of the database. We characterize contextual dependence, and explore several strategies for efficiently detecting and quantifying the effects of contextual dependence on query outputs. We present and evaluate our findings in the context of a concrete case study: Detecting temporal dependence using a database management system with versioning capabilities.
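To make the notion of temporal dependence concrete, here is a minimal sketch. It is not the detection strategy evaluated in the paper; it is a naive baseline that re-runs a query against several stored versions of a relation and reports whether the output changes. The `sales` schema, the two snapshots, and the helper names `run_query` and `is_temporally_dependent` are all hypothetical, and an in-memory SQLite database stands in for a database management system with versioning capabilities.

```python
import sqlite3


def run_query(rows, sql):
    """Load one snapshot of a hypothetical `sales` relation and evaluate `sql` on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    # Sort so that the comparison below ignores row order.
    result = sorted(conn.execute(sql).fetchall())
    conn.close()
    return result


def is_temporally_dependent(snapshots, sql):
    """Return True if the query output differs between any two snapshots."""
    results = [run_query(rows, sql) for rows in snapshots]
    return any(r != results[0] for r in results[1:])


# Two versions of the relation. Record 1 is updated in place between them,
# which violates an (assumed) append-only expectation.
v1 = [(1, "east", 100.0), (2, "west", 50.0)]
v2 = [(1, "east", 120.0), (2, "west", 50.0)]

total_sql = "SELECT region, SUM(amount) FROM sales GROUP BY region"
west_sql = "SELECT SUM(amount) FROM sales WHERE region = 'west'"

print(is_temporally_dependent([v1, v2], total_sql))  # True: the 'east' total changed
print(is_temporally_dependent([v1, v2], west_sql))   # False: 'west' rows are unchanged
```

Evaluating the query once per version is costly when there are many versions, which is presumably why the abstract stresses detecting and quantifying the dependence efficiently.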

Keywords

Query Result; Base Relation; Query Evaluation; Contextual Dependence; Database Instance
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Oliver Kennedy (1, corresponding author)
  • Ying Yang (1)
  • Jan Chomicki (1)
  • Ronny Fehling (2)
  • Zhen Hua Liu (2)
  • Dieter Gawlick (2)

  1. University at Buffalo, SUNY, Buffalo, USA
  2. Oracle Corporation, Redwood Shores, USA
