Abstract
Data fusion is a major task in data management. Different sources frequently store data about the same real-world entities but report conflicting values for their features. Data fusion aims to resolve those conflicts in order to obtain a unique global view over the sources. Some solutions have been proposed in the database literature, yet they have a number of limitations in real cases: for example, they leave too many alternatives to users or produce biased results. This paper proposes a novel data fusion algorithm that addresses conflict resolution in databases and overcomes some of the existing limitations.
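To make the conflict-resolution problem concrete, the following minimal sketch groups the values that several sources report for the same entity and attribute, then applies a naive majority vote. It is only an illustrative baseline and not the algorithm proposed in the paper; the record layout, the source names, and the helper functions `find_candidates` and `majority_vote` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records: (source, entity_id, attribute, value).
records = [
    ("source_a", "p1", "city", "Rome"),
    ("source_b", "p1", "city", "Rome"),
    ("source_c", "p1", "city", "Milan"),
    ("source_a", "p1", "age", 34),
    ("source_b", "p1", "age", 35),
]

def find_candidates(records):
    """Group the values reported for each (entity, attribute) pair."""
    candidates = defaultdict(list)
    for _source, entity, attribute, value in records:
        candidates[(entity, attribute)].append(value)
    return dict(candidates)

def majority_vote(values):
    """Return the most frequent value, or None when the top two are tied."""
    counts = Counter(values).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # conflict left unresolved
    return counts[0][0]

candidates = find_candidates(records)
fused = {key: majority_vote(values) for key, values in candidates.items()}
```

In this toy input the `city` conflict resolves to the value backed by two of three sources, while the `age` conflict ends in a tie and stays unresolved. Such a tie is one simple way in which a fusion strategy can leave the choice among alternatives to the user.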
Notes
- 1. Notice that for a given \(i\), different candidates for a variable can also derive from the same sensor, in case of duplicate measures.
Copyright information
© 2017 Springer International Publishing Switzerland
About this paper
Cite this paper
Laurenza, E. (2017). Talk to Your Neighbour: A Belief Propagation Approach to Data Fusion. In: Ferraro, M., et al. Soft Methods for Data Science. SMPS 2016. Advances in Intelligent Systems and Computing, vol 456. Springer, Cham. https://doi.org/10.1007/978-3-319-42972-4_38
DOI: https://doi.org/10.1007/978-3-319-42972-4_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42971-7
Online ISBN: 978-3-319-42972-4
eBook Packages: Engineering, Engineering (R0)