Talk to Your Neighbour: A Belief Propagation Approach to Data Fusion

  • Conference paper
Soft Methods for Data Science (SMPS 2016)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 456)

Abstract

Data fusion is a major task in data management. Different sources frequently store data about the same real-world entities, but with conflicting values for their features. Data fusion aims to resolve those conflicts in order to obtain a unique global view over the sources. Several solutions have been proposed in the database literature, yet they have limitations in real cases: for example, they leave too many alternatives to users or produce biased results. This paper proposes a novel algorithm for data fusion that addresses conflict resolution in databases and overcomes some of these limitations.

Notes

  1. Note that, for a given \(i\), different candidates for a variable may also derive from the same sensor when there are duplicate measurements.

Author information

Corresponding author

Correspondence to Eleonora Laurenza.

Copyright information

© 2017 Springer International Publishing Switzerland

About this paper

Cite this paper

Laurenza, E. (2017). Talk to Your Neighbour: A Belief Propagation Approach to Data Fusion. In: Ferraro, M., et al. Soft Methods for Data Science. SMPS 2016. Advances in Intelligent Systems and Computing, vol 456. Springer, Cham. https://doi.org/10.1007/978-3-319-42972-4_38

  • DOI: https://doi.org/10.1007/978-3-319-42972-4_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42971-7

  • Online ISBN: 978-3-319-42972-4

  • eBook Packages: Engineering, Engineering (R0)
