Abstract
Lineage is important in uncertain data management because it identifies which parts of the data contribute to a result and supports computing the probability of that result. However, existing work treats an uncertain tuple as a set of tuples that can be stored in a relational table. Lineage is recorded at the level of tuples in the table, so one can find only the tuples, not the specific attributes, that contribute to a result. When uncertain tuples have multiple uncertain attributes, users cannot tell which attribute is the main cause of a low-probability result tuple. In this paper, we propose an approach to modeling uncertain data. Compared with the alternative based on the relational model, our model achieves a low maintenance cost and avoids a large amount of redundant storage and many join operations. Based on our model, we define operations for querying data, generating lineage, and computing the probability of results. We then discuss how to compute probabilities correctly with lineage and propose an algorithm that transforms lineage for correct probability computation.
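To illustrate why lineage may need care before it is used for probability computation, the following is a minimal sketch in Python. It is not the algorithm proposed in the paper. The tuple names, probabilities, lineage formula, and both helper functions are hypothetical, and the base tuples are assumed to be independent. The sketch compares a naive computation that treats every conjunct of the lineage as independent with an exact computation that enumerates possible worlds.

```python
from itertools import product

# Hypothetical base tuples with independent existence probabilities (assumed values).
base_prob = {"t1": 0.5, "t2": 0.6, "t3": 0.4}

# Hypothetical lineage of one result tuple in DNF: (t1 AND t2) OR (t1 AND t3).
lineage = [{"t1", "t2"}, {"t1", "t3"}]


def naive_probability(dnf, prob):
    """Treat conjuncts as independent events; wrong when they share a base tuple."""
    p_no_conjunct_holds = 1.0
    for conjunct in dnf:
        p_conjunct = 1.0
        for name in conjunct:
            p_conjunct *= prob[name]
        p_no_conjunct_holds *= 1.0 - p_conjunct
    return 1.0 - p_no_conjunct_holds


def exact_probability(dnf, prob):
    """Sum the probabilities of all possible worlds in which the lineage is true."""
    names = sorted(prob)
    total = 0.0
    for world in product([True, False], repeat=len(names)):
        present = {n for n, exists in zip(names, world) if exists}
        if any(conjunct <= present for conjunct in dnf):
            weight = 1.0
            for n, exists in zip(names, world):
                weight *= prob[n] if exists else 1.0 - prob[n]
            total += weight
    return total


naive = naive_probability(lineage, base_prob)
exact = exact_probability(lineage, base_prob)
print(f"naive={naive:.2f} exact={exact:.2f}")
```

In this example t1 occurs in both conjuncts, so the naive result (0.44) overstates the exact one (0.38). Rewriting the lineage as t1 AND (t2 OR t3), where each base tuple appears once, lets the independent-event rules give 0.5 × (1 − 0.4 × 0.6) = 0.38 directly. Enumeration is exponential in the number of base tuples and is used here only to check the small example.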
Acknowledgements
This work was partially supported by the NNSFC under grant No. 61232002 and No. 61202033, NSFHP under grant No. 2011CDB448, Ph.D. SFWU under grant No. 2012211020207.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, L., Wang, L., Peng, Z. (2015). A Working Model for Uncertain Data with Lineage. In: Johannesson, P., Lee, M., Liddle, S., Opdahl, A., Pastor López, Ó. (eds) Conceptual Modeling. ER 2015. Lecture Notes in Computer Science, vol 9381. Springer, Cham. https://doi.org/10.1007/978-3-319-25264-3_27
DOI: https://doi.org/10.1007/978-3-319-25264-3_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25263-6
Online ISBN: 978-3-319-25264-3
eBook Packages: Computer Science (R0)