An Efficient Conditioning Method for Probabilistic Relational Databases

Zhu, Hong; Zhang, Caicai; Cao, Zhongsheng; Tang, Ruiming; Yang, Mengyuan

doi:10.1007/978-3-319-08010-9_25

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8485)

Included in the following conference series:

International Conference on Web-Age Information Management

Abstract

A probabilistic relational database is a probability distribution over a set of deterministic relational databases (namely, possible worlds). Several applications, such as sensor networks and data cleaning, require efficient processing of updates to probabilistic databases. Conditioning is an important class of such updates: it refines the probability distribution over the possible worlds, possibly removing some of them, based on general knowledge such as primary key constraints, functional dependencies, and others. For an arbitrary constraint, the existing conditioning methods are exponential in the number of variables in the probabilistic database. In this paper, we propose a constraint-based conditioning algorithm that considers only the variables in the given constraint, without enumerating the truth values of all the variables in the formulae of the tuples. We then prove the correctness of the algorithm. The experimental study shows that the proposed algorithm is more efficient than existing work in the literature.
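
To make the notion of conditioning concrete, the following is a minimal sketch of the naive approach that enumerates every possible world, which is the exponential baseline the abstract contrasts against, not the constraint-based algorithm proposed in the paper. It assumes independent Boolean variables, and the names `possible_worlds`, `condition`, `x1`, `x2`, and the key constraint are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product


def possible_worlds(variables):
    """Yield (world, probability) for every truth assignment.

    `variables` maps a variable name to the probability that it is true.
    Variables are assumed independent (an assumption of this sketch).
    """
    names = sorted(variables)
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        prob = Fraction(1)
        for name in names:
            p = variables[name]
            prob *= p if world[name] else 1 - p
        yield world, prob


def condition(variables, constraint):
    """Drop the worlds that violate `constraint` and renormalise the rest."""
    kept = [(w, p) for w, p in possible_worlds(variables) if constraint(w)]
    total = sum(p for _, p in kept)
    if total == 0:
        raise ValueError("constraint is unsatisfiable")
    return [(w, p / total) for w, p in kept]


# Hypothetical example: two tuples share a primary key, so at most one
# of them may exist in any world.
variables = {"x1": Fraction(3, 5), "x2": Fraction(1, 2)}


def key_constraint(world):
    return not (world["x1"] and world["x2"])


conditioned = condition(variables, key_constraint)
marginal_x1 = sum(p for w, p in conditioned if w["x1"])
```

The loop visits 2^n worlds for n variables, which illustrates why enumeration-based conditioning does not scale and why restricting attention to the variables that occur in the constraint is attractive.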

Author information

Authors and Affiliations

Huazhong University of Science and Technology, China
Hong Zhu, Caicai Zhang, Zhongsheng Cao & Mengyuan Yang
National University of Singapore, Singapore
Ruiming Tang

Editor information

Editors and Affiliations

School of Computing, University of Utah, 50 S. Central Campus Drive, Salt Lake City, UT 84112, USA
Feifei Li
Department of Computer Science, Tsinghua University, 100084, Beijing, China
Guoliang Li
POSTECH, Republic of Korea
Seung-won Hwang
Shanghai Key Laboratory of Scalable Computing and Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, China
Bin Yao
Advanced Digital Sciences Center (ADSC), 138632, Singapore, Singapore
Zhenjie Zhang

Cite this paper

Zhu, H., Zhang, C., Cao, Z., Tang, R., Yang, M. (2014). An Efficient Conditioning Method for Probabilistic Relational Databases. In: Li, F., Li, G., Hwang, Sw., Yao, B., Zhang, Z. (eds) Web-Age Information Management. WAIM 2014. Lecture Notes in Computer Science, vol 8485. Springer, Cham. https://doi.org/10.1007/978-3-319-08010-9_25

DOI: https://doi.org/10.1007/978-3-319-08010-9_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08009-3
Online ISBN: 978-3-319-08010-9
eBook Packages: Computer Science, Computer Science (R0)
