An Efficient Conditioning Method for Probabilistic Relational Databases

  • Conference paper
Web-Age Information Management (WAIM 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8485)

Abstract

A probabilistic relational database is a probability distribution over a set of deterministic relational databases, called possible worlds. Applications such as sensor networks and data cleaning require efficient processing of updates to probabilistic databases. Conditioning is an important class of such updates: it refines the probability distribution over the possible worlds, possibly removing some of them, based on general knowledge such as primary key constraints and functional dependencies. For an arbitrary constraint, existing conditioning methods are exponential in the number of variables in the probabilistic database. This paper proposes a constraint-based conditioning algorithm that considers only the variables in the given constraint, without enumerating the truth values of all the variables in the formulae of the tuples. We prove the correctness of the algorithm. The experimental study shows that the proposed algorithm is more efficient than existing work in the literature.
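
To make the notion of conditioning concrete, here is a minimal sketch of the naive possible-worlds approach that the abstract describes as exponential. It is not the algorithm proposed in the paper, whose details are not given on this page. The toy relation, the variable names `x1` to `x3`, their probabilities, and the helper functions are all hypothetical, and the sketch assumes each tuple is guarded by one independent Boolean variable.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy relation: variable -> (name, ssn). A tuple is present in a
# world exactly when its guarding variable is true (assumed independent).
TUPLES = {"x1": ("Alice", 1), "x2": ("Bob", 1), "x3": ("Carol", 2)}
PROB = {"x1": Fraction(1, 2), "x2": Fraction(1, 2), "x3": Fraction(4, 5)}


def possible_worlds(prob):
    """Yield every truth assignment with its probability (2**n assignments)."""
    names = sorted(prob)
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        weight = Fraction(1)
        for name in names:
            weight *= prob[name] if world[name] else 1 - prob[name]
        yield world, weight


def key_constraint(world):
    """Primary key on ssn: no two present tuples may share an ssn."""
    keys = [TUPLES[var][1] for var, present in world.items() if present]
    return len(keys) == len(set(keys))


def condition(prob, constraint):
    """Remove worlds that violate the constraint, then renormalise the rest."""
    kept = [(w, p) for w, p in possible_worlds(prob) if constraint(w)]
    total = sum(p for _, p in kept)  # prior probability of the constraint
    return [(w, p / total) for w, p in kept], total


def marginal(worlds, var):
    """Probability that the tuple guarded by `var` is present."""
    return sum(p for w, p in worlds if w[var])


worlds, total = condition(PROB, key_constraint)
print(len(worlds), total, marginal(worlds, "x1"), marginal(worlds, "x3"))
```

In this toy instance the key constraint rules out the two worlds in which both `x1` and `x2` hold, so six of the eight worlds survive. The marginal of `x1` drops from 1/2 to 1/3, while `x3`, which the constraint does not involve, keeps its probability of 4/5. The enumeration visits all 2^n assignments, which is the cost a constraint-based method aims to avoid.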

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhu, H., Zhang, C., Cao, Z., Tang, R., Yang, M. (2014). An Efficient Conditioning Method for Probabilistic Relational Databases. In: Li, F., Li, G., Hwang, Sw., Yao, B., Zhang, Z. (eds) Web-Age Information Management. WAIM 2014. Lecture Notes in Computer Science, vol 8485. Springer, Cham. https://doi.org/10.1007/978-3-319-08010-9_25

  • DOI: https://doi.org/10.1007/978-3-319-08010-9_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08009-3

  • Online ISBN: 978-3-319-08010-9

  • eBook Packages: Computer Science, Computer Science (R0)
