Talk to Your Neighbour: A Belief Propagation Approach to Data Fusion

Laurenza, Eleonora

doi:10.1007/978-3-319-42972-4_38

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 456)

Included in the following conference series:

International Conference on Soft Methods in Probability and Statistics

Abstract

Data fusion is a major task in data management. Different sources frequently store data about the same real-world entities, but with conflicting values for their features. Data fusion aims to resolve those conflicts in order to obtain a single global view over the sources. Several solutions have been proposed in the database literature, yet they have a number of limitations in real cases: for example, they leave too many alternatives to users or produce biased results. This paper proposes a novel data fusion algorithm that addresses conflict resolution in databases and overcomes some of these limitations.
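To make the problem statement concrete, the following minimal Python sketch shows what a conflict between sources looks like. It only detects conflicts; it does not resolve them and is not the algorithm proposed in the paper. The record layout, the source names, and the `find_conflicts` helper are all hypothetical, chosen for illustration.

```python
from collections import defaultdict

# Hypothetical records (source, entity_id, feature, value); invented for illustration.
records = [
    ("source_a", "e1", "city", "Rome"),
    ("source_b", "e1", "city", "Roma"),
    ("source_a", "e1", "age", 34),
    ("source_b", "e1", "age", 34),
    ("source_c", "e2", "city", "Warsaw"),
]


def find_conflicts(records):
    """Map each (entity_id, feature) that has more than one distinct
    candidate value to a dict {value: [sources reporting that value]}."""
    candidates = defaultdict(lambda: defaultdict(list))
    for source, entity_id, feature, value in records:
        candidates[(entity_id, feature)][value].append(source)
    # Keep only the features on which the sources disagree.
    return {
        key: dict(values)
        for key, values in candidates.items()
        if len(values) > 1
    }


conflicts = find_conflicts(records)
for (entity_id, feature), values in conflicts.items():
    print(entity_id, feature, values)
```

Here the two sources agree on the age of entity `e1` but report different values for its city, so only that feature is flagged. A data fusion method would then have to choose or combine the candidate values to produce one global view.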

Notes

1. Note that, for a given \(i\), different candidates for a variable can also come from the same sensor when measurements are duplicated.

Author information

Authors and Affiliations

Sapienza University, Piazza Aldo Moro, 5, Rome, Italy
Eleonora Laurenza

Corresponding author

Correspondence to Eleonora Laurenza.

Editor information

Editors and Affiliations

Department of Statistical Sciences, Sapienza University of Rome, Rome, Italy
Maria Brigida Ferraro
Department of Statistical Sciences, Sapienza University of Rome, Rome, Italy
Paolo Giordani
Department of Basic and Applied Sciences for Engineering, Sapienza University of Rome, Rome, Italy
Barbara Vantaggi
Department of Stochastic Methods, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marek Gagolewski
Department of Statistics, Operations Research and Mathematics Didactics, University of Oviedo, Oviedo, Spain
María Ángeles Gil
Department of Stochastic Methods, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Przemysław Grzegorzewski
Department of Stochastic Methods, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Olgierd Hryniewicz

Cite this paper

Laurenza, E. (2017). Talk to Your Neighbour: A Belief Propagation Approach to Data Fusion. In: Ferraro, M., et al. Soft Methods for Data Science. SMPS 2016. Advances in Intelligent Systems and Computing, vol 456. Springer, Cham. https://doi.org/10.1007/978-3-319-42972-4_38

DOI: https://doi.org/10.1007/978-3-319-42972-4_38
Published: 30 July 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42971-7
Online ISBN: 978-3-319-42972-4
eBook Packages: Engineering, Engineering (R0)
