Data Quality Management Past, Present, and Future: Towards a Management System for Data

Handbook of Data Quality

Abstract

This chapter provides a prospective look at the “big research issues” in data quality. It is based on 25 years of experience, most of it as a practitioner; early work with a terrific team of researchers and business people at Bell Labs and AT&T; constant reflection on the meanings and methods of quality, the strange and wondrous properties of data, the importance of data and data quality in markets and companies, and the underlying reasons that some enterprises make rapid progress and others fall flat; and interactions with most of the leading companies, practitioners, and researchers.

Notes

  1. Specialized data providers serve many industries. Bloomberg, Morningstar, and Thomson-Reuters are household names in the financial services sector, for example. In these “pure data markets,” data quality is indeed front and center.

  2. Quite obviously, government agencies and other nonprofits should not aim to “make money” from data. For them, “advance organizational mission” might be more appropriate. I’ve purposefully left “make money” in the body of text here, as I want to leave a hard edge on the point. Sooner or later, data must be recognized as an asset as important as capital and people (and maybe a few others).

  3. We explicitly recognize that a customer need not be a person. A computer program, an organization, and any other entity that uses data may qualify.

  4. The late Dr. William Barnard of the Juran Institute introduced me to this notion.

  5. And, to a lesser degree, from good practice in data collection for scientific experimentation, though I know of no good reference to back up the assertion.

  6. A (perhaps) interesting historical note: In the early 1990s, the team I worked on at Bell Labs struggled to come up with a good definition of data, for quality purposes, and to define dimensions of data quality. We finally came up with definitions we found acceptable. And then, we realized we had completely missed the point. Our approach treated data as “static,” in a database. But static data are stunningly uninteresting. Data are interesting when they are created, moved about, morphed to suit individual needs, put to work, and combined with other data. We wrote [20] as a result, and I personally think it is this team’s most important paper. At the same time, the conclusion, once stated, is obvious!

  7. I’ve put “root causes” in quotes because a proper root cause analysis is considerably more disciplined than that conducted here.

  8. Several comments here. First, these are by no means the only issues. See Chapter 7 of Data Driven [26] for a fuller explanation of these and many others. Second, I am not the only person to have observed such issues. See Silverman [32] and Thomas [34] for other perspectives. Third, and most importantly, I have no formal training as a social scientist. It would be enormously helpful if sociologists, anthropologists, political scientists, and others brought more sophisticated tools to bear in helping understand these issues.

  9. Variously, Tech groups may be called Information Technology, the Chief Information Office, Information Management, Management Information Systems, etc.

  10. Dr. Godfrey is Dean, School of Textiles, at North Carolina State University. He made the comment repeatedly as Head of the Quality Theory and Methods Department at Bell Labs and as CEO of the Juran Institute in the 1980s and 1990s. I don’t recall ever seeing it in print, nor can I confirm that he was the first to make the observation.

  11. I believe this observation is due to Robert W. Pautke, Cincinnati, OH.

  12. To be clear, I have every expectation that this list is incomplete.

  13. Note here another reason not to share data!

  14. Some may argue that these points merely reflect our progress in economic development and thus do not constitute a root cause at all. They have a fair point.

  15. Earlier, I noted that there was considerable debate on the definition of “information” in our field. I think definitions should be based on entropy and/or uncertainty.

  16. I want to be careful here. This statement is not strictly true, as entropy is a probabilistic measure.

  17. The careful reader will object that “if data are structured,” then “unstructured data” is nonsensical. Unfortunately, those who coined the phrase appear not to have taken this into account.

  18. Even data tracking, in many ways the most powerful measurement technique, employs a form of business rules.

  19. Rob Hillard and I have started a research project along these lines.

  20. This phrase closely mirrors the moniker of Data Blueprint, “better data for better decisions.”

  21. “Stunning” in the sense that these roles were so unexpected even a few years ago.

  22. Some may argue that one cannot complete a “fundamental rethink” in advance of a “fundamental think.” They have a point.

  23. See Chapter 3 of Data Driven [26] for a more complete discussion.

References

  1. Beer S (1979) The heart of enterprise. Wiley, New York

  2. Borek A, Parlikad AL, Woodall P (2011) Towards a process for total information risk management. In: Proceedings of the 16th international conference on information quality, University of South Australia, Adelaide, 18–20 November 2011

  3. Brackett MH (2000) Data resource quality: turning bad habits into good practice. Addison-Wesley, Boston

  4. Brynjolfsson E, Hitt LM, Kim HH (2011) Strength in numbers: how does data-driven decision making affect firm performance? SSRN: http://ssrn.com/abstract=1819486 or http://dx.doi.org/10.2139/ssrn.1819486

  5. Carr N (2003) IT doesn’t matter. Harv Bus Rev 81(5):41–49

  6. Chandler AD (1977) The visible hand: the managerial revolution in American business. The Belknap Press, Cambridge

  7. Chandler AD, Cortada JW (eds) (2000) A nation transformed by information: how information has shaped the United States from colonial times to the present. Oxford University Press, England

  8. English LP (1999) Improving data warehouse and business information quality. Wiley, New York

  9. Eppler MJ (2003) Managing information quality. Springer, Berlin

  10. Fisher T (2009) The data asset: how smart companies govern their data for business success. Wiley, Hoboken

  11. Fox C, Levitin AV, Redman TC (1994) The notion of data and its quality dimensions. Inf Process Manag 30(1):9–19

  12. Greene R, Elffers J (1998) The 48 laws of power. Viking, New York

  13. Hillard R (2010) Information-driven business: how to manage data and information for maximum advantage. Wiley, Hoboken

  14. Huang KT, Lee YW, Wang RY (1999) Quality information and knowledge. Prentice-Hall, Upper Saddle River

  15. Jaques E (1988) Requisite organization. Cason Hall & Company, Arlington

  16. Juran JM, Godfrey AB (1999) Juran’s quality handbook, 5th edn. McGraw-Hill, New York

  17. Kushner T, Villar M (2009) Managing your business data: from chaos to confidence. Racom Communications, Chicago

  18. Laney D (2011) Infonomics: the economics of information and principles of information asset management. In: Proceedings of the 5th MIT information quality industry symposium, Cambridge, Massachusetts, 13–15 July 2011

  19. Lee YW, Pipino LL, Funk JD, Wang RY (2006) Journey to data quality. MIT Press, Cambridge

  20. Levitin AV, Redman TC (1993) A model of data (life) cycles with applications to quality. Inf Softw Technol 35(4):217–224

  21. Levitin AV, Redman TC (1995) Quality dimensions of a conceptual view. Inf Process Manag 31(1):81–88

  22. Loshin D (2011) The practitioner’s guide to data quality improvement. Elsevier, Amsterdam

  23. McGilvray D (2008) Executing data quality projects: ten steps to trusted data. Morgan Kaufmann, Amsterdam

  24. Olson JE (2009) Data quality: the accuracy dimension. Morgan Kaufmann, Amsterdam

  25. Pyzdek T, Keller P (2009) The six-sigma handbook, 3rd edn. McGraw-Hill, New York

  26. Redman TC (2008) Data driven: profiting from your most important business asset. Harv Bus Press, Boston

  27. Redman TC (2001) Data quality: the field guide. Digital Press, Boston

  28. Redman TC (2004) Measuring data accuracy: a framework and review. Stud Commun Sci 4(2):53–58

  29. Roberts DJ (2004) The modern firm. Oxford University Press, Oxford

  30. Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379–423

  31. Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:623–656

  32. Silverman L (2006) Wake me when the data is over: how organizations use stories to drive results. Jossey-Bass, San Francisco

  33. Talburt JR (2011) Entity resolution and information quality. Morgan Kaufmann, Amsterdam

  34. Thomas G (2006) Alpha males and data disasters: the case for data governance. Brass Cannon, Orlando

  35. Wang RY, Strong DM (1996) Beyond accuracy: what data quality means to data consumers. J Manag Inf Syst 12(4):5–33

  36. Yoon Y, Aiken P, Guimaraes T (2000) Managing organizational data resources: quality dimensions. Inf Resour Manag J 13(3):5–13

Author information

Corresponding author

Correspondence to Thomas C. Redman.

Copyright information

© 2012 Thomas C. Redman, Ph.D.

About this chapter

Cite this chapter

Redman, T.C. (2012). Data Quality Management Past, Present, and Future: Towards a Management System for Data. In: Sadiq, S. (eds) Handbook of Data Quality. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36257-6_2

  • DOI: https://doi.org/10.1007/978-3-642-36257-6_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36256-9

  • Online ISBN: 978-3-642-36257-6

  • eBook Packages: Computer Science, Computer Science (R0)
