Encyclopedia of Big Data

Living Edition
Editors: Laurie A. Schintler, Connie L. McNeely

Data Quality Management

  • Erik W. Kuiler
Living reference work entry
DOI: https://doi.org/10.1007/978-3-319-32001-4_317-1

Introduction

With the increasing availability of Big Data and their attendant analytics, the importance of data quality management has increased. Poor data quality represents one of the greatest hurdles to effective data analytics, computational linguistics, machine learning, and artificial intelligence. If the data are inaccurate, incomprehensible, or unusable, it does not matter how sophisticated our algorithms and paradigms are, or how intelligent our “machines.”

J. M. Juran provides a definition of data quality that is applicable to current Big Data environments: “Data are of high quality if they are fit for their intended use in operations, decision making, and planning” (Juran and Godfrey 1999, p. 34.9). In this context, quality means that Big Data are relevant to their intended uses and are of sufficient detail and quantity, with a high degree of accuracy and completeness, of known provenance, consistent with their metadata, and presented in appropriate ways.
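Two of the dimensions named above, completeness and consistency with metadata, lend themselves to simple measurement. The following is a minimal Python sketch rather than a method taken from this entry: the records, the field names, the schema, and the helper functions are all hypothetical, and treating None and empty strings as missing values is an assumption that a real data set may not share.

```python
from typing import Any

# Hypothetical records and schema, invented purely for illustration.
RECORDS = [
    {"id": 1, "age": 34, "state": "VA"},
    {"id": 2, "age": None, "state": "MD"},
    {"id": 3, "age": "unknown", "state": ""},
]
SCHEMA = {"id": int, "age": int, "state": str}  # stands in for the metadata


def is_missing(value: Any) -> bool:
    """Assumption: None and empty strings count as missing values."""
    return value is None or value == ""


def completeness(records: list[dict], fields: list[str]) -> float:
    """Return the fraction of expected field values that are present."""
    total = len(records) * len(fields)
    if total == 0:
        return 1.0
    present = sum(
        not is_missing(rec.get(field)) for rec in records for field in fields
    )
    return present / total


def metadata_violations(records: list[dict], schema: dict) -> list[tuple]:
    """Return (row index, field) pairs whose present value has the wrong type."""
    return [
        (index, field)
        for index, rec in enumerate(records)
        for field, expected in schema.items()
        if not is_missing(rec.get(field))
        and not isinstance(rec.get(field), expected)
    ]


if __name__ == "__main__":
    print(f"completeness: {completeness(RECORDS, list(SCHEMA)):.2f}")
    print(f"metadata violations: {metadata_violations(RECORDS, SCHEMA)}")
```

In this sketch, missing values lower the completeness score but are not reported as metadata violations, so the two dimensions stay separate. Whether a given score is acceptable depends on the intended use of the data, in keeping with the fitness-for-use definition above.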

Big Data provide...


Further Readings

  1. Acock, A. C. (2005). Working with missing values. Journal of Marriage and Family, 67, 1012–1028.
  2. Allison, P. D. (2002). Missing data. Thousand Oaks: Sage Publications.
  3. Juran, J. M., & Godfrey, A. B. (1999). Juran’s quality handbook (5th ed.). New York: McGraw-Hill.
  4. Labouseur, A. G., & Matheus, C. (2017). An introduction to dynamic data quality challenges. ACM Journal of Data and Information Quality, 8(2), 1–3.
  5. Little, R. J. A., & Rubin, D. B. (1997). Statistical analysis with missing data. New York: Wiley.
  6. Pipino, L. L., Lee, Y. W., & Wang, R. Y. (2002). Data quality assessment. Communications of the ACM, 45(4), 211–218.
  7. Saunders, J. A., Morrow-Howell, N., Spitznagel, E., Dore, P., Proctor, E. K., & Pescarino, R. (2006). Imputing missing data: A comparison of methods for social workers. Social Work Research, 30(1), 19–30.
  8. Strong, D. M., Lee, Y. W., & Wang, R. Y. (1997). Data quality in context. Communications of the ACM, 40(5), 103–110.
  9. Truong, H.-L., Murguzur, A., & Yang, E. (2018). Challenges in enabling quality of analytics in the cloud. Journal of Data and Information Quality, 9(2), 1–4.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. George Mason University, Arlington, USA