Definition
Scientific databases often hold data that comes from multiple sources of varying quality; that is heterogeneous, incomplete, and inconsistent; and that is riddled with measurement errors. Uncertainty management comprises a set of techniques for modeling and representing the various uncertainties that arise in scientific data and for enabling users to query that data. This entry describes the UII system [10], which addresses the problem of managing uncertainty when integrating scientific databases.
Historical Background
Distributed data integration is becoming increasingly popular in biomedical research and in scientific research in general. Its popularity rests on the realization that combining sources frequently leads to novel scientific discoveries that cannot be drawn from any single source in isolation. However, as more and more scientific data is shared and as tools are built to provide a common query interface for it, scientists face the major problem of dealing with...
Recommended Reading
Barbará D, Garcia-Molina H, Porter D. The management of probabilistic data. IEEE Trans Knowl Data Eng. 1992;4(5):487–502.
Boulos J, Dalvi N, Mandhani B, Mathur S, Re C, Suciu D. Mystiq: a system for finding more answers by using probabilities. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2005. p. 891–3.
Cavallo R, Pittarelli M. The theory of probabilistic databases. In: Proceedings of the 13th International Conference on Very Large Data Bases; 1987. p. 71–81.
Dalvi N, Suciu D. Efficient query evaluation on probabilistic databases. In: Proceedings of the 26th International Conference on Very Large Data Bases; 2004. p. 864–75.
Deshpande A, Sarawagi S. Probabilistic graphical models and their role in databases. In: Proceedings of the 33rd International Conference on Very Large Data Bases; 2007. p. 1435–6.
Dey D, Sarkar S. A probabilistic relational model and algebra. ACM Trans Database Syst. 1996;21(3):339–69.
Garofalakis MN, Brown KP, Franklin MJ, Hellerstein JM, Wang DZ, Michelakis E, Tancau L, Wu E, Jeffery SR, Aipperspach R. Probabilistic data management for pervasive computing: The data furnace project. IEEE Data Eng Bull. 2006;29(1):57–63.
Karger DR. A randomized fully polynomial time approximation scheme for the all terminal network reliability problem. In: Proceedings of the 27th Annual ACM Symposium on Theory of Computing; 1995. p. 11–7.
Lakshmanan LVS, Leone N, Ross R, Subrahmanian VS. Probview: a flexible probabilistic database system. ACM Trans Database Syst. 1997;22(3):419–69.
Louie B, Detwiler L, Dalvi N, Shaker R, Tarczy-Hornoch P, Suciu D. Incorporating uncertainty metrics into a general-purpose data integration system. In: Proceedings of the 19th International Conference on Scientific and Statistical Database Management; 2007. p. 19–28.
Re C, Dalvi N, Suciu D. Efficient top-k query evaluation on probabilistic data. In: Proceedings of the 23rd International Conference on Data Engineering; 2007. p. 886–95.
Sen P, Deshpande A. Representing and querying correlated tuples in probabilistic databases. In: Proceedings of the 23rd International Conference on Data Engineering; 2007. p. 596–605.
Shaker R, Mork P, Brockenbrough JS, Donelson L, Tarczy-Hornoch P. The biomediator system as a tool for integrating biologic databases on the web. In: Proceedings of the 4th International Workshop on Information Integration on the Web; 2004.
Singh S, Mayfield C, Mittal S, Prabhakar S, Hambrusch S, Shah R. Orion 2.0: native support for uncertain data. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2008. p. 1239–42.
Suciu D, Dalvi N. Foundations of probabilistic answers to queries. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2005. p. 963.
Tatusova TA, Madden TL. Blast 2 sequences - a new tool for comparing protein and nucleotide sequences. FEMS Microbiol Lett. 1999;174(2):247–50.
Wang K, Tarczy-Hornoch P, Shaker R, Mork P, Brinkley J. Biomediator data integration: beyond genomics to neuroscience data. In: Proceedings of the AMIA Annual Fall Symposium; 2005. p. 779–83.
Widom J. Trio: a system for integrated management of data, accuracy, and lineage. In: Proceedings of the 2nd Biennial Conference on Innovative Data Systems Research; 2005.
Woods DD, Patterson ES, Roth EM, Christoffersen K. Can we ever escape from data overload? a cognitive systems diagnosis. Cogn Technol Work. 2002;4(1):22–36.
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Dalvi, N. (2018). Uncertainty Management in Scientific Database Systems. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_1302
DOI: https://doi.org/10.1007/978-1-4614-8265-9_1302
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering