Abstract
To gain knowledge from data, the data must be of high quality. Poor data quality is an increasingly serious problem for companies that are beginning to exploit their data stocks. This article presents the main obstacles on the way to perfect data quality. It is based on our experience in improving data quality in large customer or business partner databases. The examples in this paper show data defects we have found in our daily work. It also includes notes on how to improve data quality and avoid data defects.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schmid, J. (2004). The Main Steps to Data Quality. In: Perner, P. (ed.) Advances in Data Mining. ICDM 2004. Lecture Notes in Computer Science, vol 3275. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30185-1_8
DOI: https://doi.org/10.1007/978-3-540-30185-1_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24054-9
Online ISBN: 978-3-540-30185-1
eBook Packages: Computer Science (R0)