An Indicator Function for Insufficient Data Quality – A Contribution to Data Accuracy
Abstract
Because insufficient data quality usually leads to wrong decisions and high costs, managing data quality is a prerequisite for the successful execution of business and decision processes. Economics-driven data quality management needs efficient measurement procedures that allow poor data quality to be identified in a largely automated way. Against this background, the paper investigates how metrics for the data quality (DQ) dimensions completeness, validity, and currency can be aggregated to derive an indicator for accuracy. To this end, existing approaches to measuring these dimensions are analyzed to make explicit which metric addresses which aspect of data quality. Based on this analysis, an indicator function is designed that returns a measure for accuracy on different levels of a data resource. The indicator function’s applicability is demonstrated using a customer database example.
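The abstract does not state the indicator function itself, so the following is only a hypothetical sketch of the general idea: three per-value metrics in [0, 1] are combined into one score, which is then aggregated from attribute values up to a record. The product aggregation, the exponential currency decay, the unweighted mean, the threshold, and all function names and sample values are assumptions made for illustration, not the paper's definitions.

```python
import math
import re
from datetime import date
from statistics import mean


def completeness(value):
    """1.0 if a value is stored, 0.0 if it is missing (None or empty string)."""
    return 0.0 if value is None or value == "" else 1.0


def validity(value, rule):
    """1.0 if a stored value satisfies its domain rule, otherwise 0.0."""
    return 1.0 if completeness(value) == 1.0 and rule(value) else 0.0


def currency(last_updated, today, decline_rate):
    """Assumption: currency decays exponentially with the value's age in years."""
    age_years = (today - last_updated).days / 365.25
    return math.exp(-decline_rate * age_years)


def value_score(value, rule, last_updated, today, decline_rate):
    """Assumption: product of the three metrics, so any zero metric yields zero."""
    return (completeness(value) * validity(value, rule)
            * currency(last_updated, today, decline_rate))


def tuple_score(value_scores):
    """Assumption: record-level score is the unweighted mean of its value scores."""
    return mean(value_scores)


def is_insufficient(score, threshold):
    """Flag a score that falls below a (hypothetical) acceptance threshold."""
    return score < threshold


# Hypothetical customer record: (value, domain rule, last update, decline rate).
today = date(2024, 1, 1)
is_name = lambda v: v.strip() != ""
is_email = lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None

customer = [
    ("Ada Example", is_name, date(2024, 1, 1), 0.0),
    ("not-an-email", is_email, date(2024, 1, 1), 0.2),
]

scores = [value_score(v, rule, upd, today, rate) for v, rule, upd, rate in customer]
record_score = tuple_score(scores)
print(scores, record_score, is_insufficient(record_score, threshold=0.6))
```

In this sketch, the invalid e-mail value pulls the record-level score down to 0.5, so the record is flagged under the assumed threshold of 0.6. A relation-level score could be obtained the same way by averaging record scores, again as an assumption rather than the paper's procedure.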
Keywords
Data quality, data quality management, measurement, accuracy