Synonyms
Data anomalies; Data errors; Data inconsistencies; Data problems; Data quality problems
Definition
Data conflicts are deviations between data intended to capture the same state of a real-world entity. Data with conflicts are often called “dirty” data and can mislead any analysis performed on them. When data conflicts occur, data cleaning is needed to improve data quality and to avoid wrong analysis results. An understanding of the different kinds of data conflicts and their characteristics makes it possible to develop corresponding data cleaning techniques.
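To make the definition concrete, the following minimal Python sketch compares two records that are assumed to describe the same real-world entity and reports the attributes on which they deviate. The function name find_conflicts, the record layout, and the sample values are hypothetical and serve only as an illustration; they are not taken from the entry.

```python
from __future__ import annotations

from typing import Any


def find_conflicts(
    record_a: dict[str, Any], record_b: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return the shared attributes on which two records disagree.

    Both records are assumed to describe the same real-world entity.
    """
    conflicts = {}
    # Only attributes present in both records can be compared.
    for attribute in sorted(record_a.keys() & record_b.keys()):
        value_a, value_b = record_a[attribute], record_b[attribute]
        if value_a != value_b:
            conflicts[attribute] = (value_a, value_b)
    return conflicts


# Hypothetical records for the same customer held in two sources.
crm_record = {"name": "Jane Smith", "city": "Boston", "birth_year": 1980}
billing_record = {"name": "Jane Smith", "city": "Chicago", "birth_year": 1980}

conflicts = find_conflicts(crm_record, billing_record)
print(conflicts)  # {'city': ('Boston', 'Chicago')}
```

The sketch compares values by exact equality and presupposes that the two records are already known to refer to the same entity. Deciding which value is correct, or detecting that two records are duplicates in the first place, is the task of data cleaning and is outside its scope.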
Historical Background
Statisticians were probably the first to face data conflicts on a large scale. Early applications that required intensive resolution of data conflicts were statistical surveys in the areas of governmental administration, public health, and scientific experiments. As early as 1946, Halbert L. Dunn observed the problem of duplicates in data records of a person’s life captured at different places...
Recommended Reading
Barateiro J, Galhardas H. A survey of data quality tools. Datenbank-Spektrum. 2005;14(15–21):48.
Batini C, Scannapieco M. Data quality – concepts, methodologies and techniques. Berlin: Springer; 2006.
Dunn HL. Record linkage. Am J Public Health. 1946;36(12):1412–6.
Fellegi IP, Sunter AB. A theory for record linkage. J Am Stat Assoc. 1969;64(328):1183–210.
Kim W, Choi B-J, Kim S-K, Lee D. A taxonomy of dirty data. Data Min Knowl Discov. 2003;7(1):81–99.
Rahm E, Do H-H. Data cleaning – problems and current approaches. IEEE Techn Bull Data Eng. 2000;23(4):3–13.
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
Do, HH. (2018). Data Conflicts. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_97
DOI: https://doi.org/10.1007/978-1-4614-8265-9_97
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science, Reference Module Computer Science and Engineering