Provenance Based Conflict Handling Strategies

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7240)

Abstract

A fundamental task in data integration is data fusion, the process of fusing multiple records that represent the same real-world object into a single consistent representation. Data fusion involves resolving possible conflicts between data coming from different sources; several high-level strategies for handling inconsistent data have been described and classified in [8].

The MOMIS Data Integration System [2] uses both conflict-avoiding strategies (such as the trust your friends strategy, which takes the value of a preferred source) and conflict resolution strategies (such as the meet in the middle strategy, which takes an average value).
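The two strategies named above can be illustrated with a minimal sketch. This is not MOMIS code: the record layout, the source names (`S1`, `S2`, `S3`), the `price` attribute, and both function names are hypothetical, chosen only to show the difference between taking a preferred source's value and averaging the conflicting values.

```python
from statistics import mean

# Hypothetical records describing the same real-world object, one per source.
records = [
    {"source": "S1", "price": 100.0},
    {"source": "S2", "price": 110.0},
    {"source": "S3", "price": 120.0},
]

def trust_your_friends(records, attribute, preferred_source):
    """Conflict avoidance: take the value reported by the preferred source."""
    for record in records:
        if record["source"] == preferred_source and record.get(attribute) is not None:
            return record[attribute]
    return None  # the preferred source has no value for this attribute

def meet_in_the_middle(records, attribute):
    """Conflict resolution: take the average of all non-null values."""
    values = [r[attribute] for r in records if r.get(attribute) is not None]
    return mean(values) if values else None

print(trust_your_friends(records, "price", "S2"))  # 110.0
print(meet_in_the_middle(records, "price"))        # 110.0
```

In this sketch the avoiding strategy never inspects the conflicting values, while the resolution strategy computes a new value from all of them; the two happen to agree here only because of the sample data.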

In this paper we consider other strategies proposed in the literature to handle inconsistent data, and we discuss how they can be adopted and extended in the MOMIS Data Integration System. First, we consider the methods introduced by the Trio system [1,6], which are based on the idea of tackling data conflicts by explicitly including provenance information to represent uncertainty and using it to answer queries. Other possible strategies are to ignore conflicting values at the global level (i.e., only consistent values are considered) or to consider all conflicting values at the global level.
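As a rough illustration of the last two strategies, the sketch below keeps, for each distinct value, the list of sources that reported it. It is a simplification written for this summary and does not reproduce Trio's data model or the framework of the paper; the record layout, the sample data, and the function names are all hypothetical.

```python
# Hypothetical records for one real-world object, one record per source.
records = [
    {"source": "S1", "name": "Mario Rossi", "city": "Modena"},
    {"source": "S2", "name": "Mario Rossi", "city": "Modena"},
    {"source": "S3", "name": "Mario Rossi", "city": "Bologna"},
]

def values_with_provenance(records, attribute):
    """Map each distinct non-null value to the sources that reported it."""
    provenance = {}
    for record in records:
        value = record.get(attribute)
        if value is not None:
            provenance.setdefault(value, []).append(record["source"])
    return provenance

def consistent_only(records, attribute):
    """Ignore conflicts: return a value only if all sources agree on it."""
    provenance = values_with_provenance(records, attribute)
    if len(provenance) == 1:
        return next(iter(provenance))
    return None  # conflicting (or missing) values are not exposed

def all_values(records, attribute):
    """Consider all conflicting values, each annotated with its sources."""
    return values_with_provenance(records, attribute)

print(consistent_only(records, "name"))  # Mario Rossi
print(consistent_only(records, "city"))  # None
print(all_values(records, "city"))       # {'Modena': ['S1', 'S2'], 'Bologna': ['S3']}
```

Keeping the source list next to each value is what lets a query choose, at answer time, between hiding a conflict and exposing every alternative.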

The original contribution of this paper is a provenance-based framework that includes all the above-mentioned conflict handling strategies and uses them as different search strategies for querying the integrated sources.

References

  1. Agrawal, P., Benjelloun, O., Sarma, A.D., Hayworth, C., Nabar, S., Sugihara, T., Widom, J.: Trio: a system for data, uncertainty, and lineage. In: VLDB 2006: Proceedings of the 32nd International Conference on Very Large Data Bases, pp. 1151–1154. VLDB Endowment (2006)

  2. Beneventano, D., Bergamaschi, S., Guerra, F., Orsini, M.: Data integration. In: Embley, D., Thalheim, B. (eds.) Handbook of Conceptual Modelling. Springer, Heidelberg (2010), http://dbgroup.unimo.it/SSE/SSE.pdf

  3. Beneventano, D., Bergamaschi, S., Guerra, F., Vincini, M.: Synthesizing an integrated ontology. IEEE Internet Computing 7(5), 42–51 (2003)

  4. Beneventano, D., Dannoui, A.R., Sala, A.: Data lineage in the MOMIS data fusion system. In: ICDE Workshops of the 27th International Conference on Data Engineering, ICDE 2011, Hannover, Germany, April 11-16, pp. 53–58 (2011)

  5. Benjelloun, O., Sarma, A.D., Halevy, A., Widom, J.: ULDBs: databases with uncertainty and lineage. In: VLDB 2006: Proceedings of the 32nd International Conference on Very Large Data Bases, pp. 953–964. VLDB Endowment (2006)

  6. Benjelloun, O., Sarma, A.D., Hayworth, C., Widom, J.: An introduction to ULDBs and the Trio system. IEEE Data Eng. Bull. 29(1), 5–16 (2006)

  7. Bergamaschi, S., Castano, S., Vincini, M., Beneventano, D.: Semantic integration of heterogeneous information sources. Data Knowl. Eng. 36(3), 215–249 (2001)

  8. Bleiholder, J., Naumann, F.: Data fusion. ACM Comput. Surv. 41(1), 1–41 (2008)

  9. Chomicki, J.: Consistent Query Answering: Five Easy Pieces. In: Schwentick, T., Suciu, D. (eds.) ICDT 2007. LNCS, vol. 4353, pp. 1–17. Springer, Heidelberg (2006)

  10. Cui, Y., Widom, J.: Lineage tracing for general data warehouse transformations. The VLDB Journal 12(1), 41–58 (2003)

  11. Cui, Y., Widom, J., Wiener, J.L.: Tracing the lineage of view data in a warehousing environment. ACM Trans. Database Syst. 25(2), 179–227 (2000)

  12. Glavic, B., Alonso, G.: Perm: Processing provenance and data on the same data model through query rewriting. In: Proceedings of the 2009 IEEE International Conference on Data Engineering, pp. 174–185. IEEE Computer Society, Washington, DC (2009)

  13. Halevy, A., Li, C.: Information integration research: Summary of NSF IDM workshop breakout session. In: NSF IDM Workshop (2003)

  14. Halevy, A., Rajaraman, A., Ordille, J.: Data integration: the teenage years. In: VLDB 2006: Proceedings of the 32nd International Conference on Very Large Data Bases, pp. 9–16. VLDB Endowment (2006)

  15. Lenzerini, M.: Data integration: A theoretical perspective. In: PODS, pp. 233–246 (2002)

  16. Naumann, F., Freytag, J.C., Leser, U.: Completeness of integrated information sources. Inf. Syst. 29(7), 583–615 (2004)

  17. Sarma, A.D., Benjelloun, O., Halevy, A., Nabar, S., Widom, J.: Representing uncertain data: models, properties, and algorithms. The VLDB Journal 18(5), 989–1019 (2009)

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Beneventano, D. (2012). Provenance Based Conflict Handling Strategies. In: Yu, H., Yu, G., Hsu, W., Moon, YS., Unland, R., Yoo, J. (eds) Database Systems for Advanced Applications. DASFAA 2012. Lecture Notes in Computer Science, vol 7240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29023-7_29

  • DOI: https://doi.org/10.1007/978-3-642-29023-7_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29022-0

  • Online ISBN: 978-3-642-29023-7

  • eBook Packages: Computer Science, Computer Science (R0)
