Belief Revision in Uncertain Data Integration

  • Fereidoon Sadri
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9093)

Abstract

This paper studies the problem of integrating probabilistic uncertain information. Certain constraints are imposed by the semantics of integration, but there is no guarantee that they are satisfied in practical situations. We present a Bayesian-based approach to revise the probability distribution of the information in the sources in a systematic way to remedy this difficulty. The revision step is similar in spirit to tasks like data cleaning and record linkage and should be carried out before integration can be achieved for probabilistic uncertain data.

Keywords

Information integration; Uncertain data; Probabilistic data; Belief revision

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Computer Science, University of North Carolina, Greensboro, USA
