Abstract
When creating software components that aim to alleviate information quality problems, it is necessary to elicit the requirements that the problem holders have, as well as the details of the existing technical infrastructure that will form the basis of the solution. In the literature, standard sets of IQ dimensions have been proposed as a means of initiating and structuring the information gathering and design processes involved.
Over the past decade, we have been involved in several projects to develop IQ assessment components. In the earlier projects, we tried hard to make use of the standard IQ dimensions in this way, but found that we derived little benefit from this approach. In some cases, the IQ problem we were focussed on could not be assigned cleanly to one dimension or another. In others, the dimension was clear, but knowing it saved us very little of the work we would have had to do had the dimension not been identified up front.
However, IQ problems are typically very challenging, and some form of guiding principle is needed. In this paper, we propose our earlier notion of the Quality View (QV) as an alternative (or an addition) to IQ dimensions for developing IQ management components. We reflect on our experiences in using QVs in three quite different IQ-related projects, and show how our initial basic pattern turned out to be a good starting point for the information gathering and design tasks involved, replacing IQ dimensions in the role originally envisaged for them.
Notes
1. The concept is mentioned, but not defined, by Pipino et al. (2002).
References
Batini, C., & Scannapieco, M. (2006). Data quality: Concepts, methodologies and techniques. Berlin: Springer.
Burgoon, L., Eckel-Passow, J., Gennings, C., Boverhof, D., Burt, J., Fong, C., & Zacharewski, T. (2005). Protocols for the assurance of microarray data quality and process control. Nucleic Acids Research, 33(19), e172.
Elmagarmid, A., Ipeirotis, P., & Verykios, V. (2007). Duplicate record detection: A survey. IEEE Transactions on Knowledge and Data Engineering, 19(1), 1–16.
Emran, N. (2011). Definition and analysis of population-based data completeness measurement. Ph.D. thesis, The University of Manchester.
Emran, N., Embury, S., & Missier, P. (2008). Model-driven component generation for families of completeness. In P. Missier, X. Lin, A. de Keijzer, & M. van Keulen (Eds.), Proceedings of the international workshop on quality in databases and management of uncertain data, Auckland, New Zealand, pp. 123–132.
Emran, N., Embury, S., Missier, P., Mat Isa, M., & Kamilah Muda, A. (2013). Measuring data completeness for a microbial genomics database. In A. Selamat et al. (Eds.), Proceedings of 5th Asian Conference on Intelligent information and database systems (ACIIDS’13) (LNAI 7802, pp. 186–195). Kuala Lumpur: Springer.
English, L. (1999). Improving data warehouse and business information quality. New York: Wiley.
English, L. (2009). Information quality applied: Best practices for improving business information, processes and systems. Indianapolis: Wiley.
Eppler, M., & Muenzenmayer, P. (2002). Measuring information quality in the web context: A survey of state-of-the-art instruments and an application methodology. In C. Fisher & B. Davidson (Eds.), Proceedings of 7th international conference on information quality (IQ 2002) (pp. 187–196). Cambridge, MA, USA: MIT.
Fox, C., Levitin, A., & Redman, T. (1994). The notion of data and its quality dimensions. Information Processing and Management, 30(1), 9–19.
Hedeler, C., Embury, S. M., & Paton, N. W. (2013). The role of reference data sets in data deduplication. In preparation.
Hull, D., Wolstencroft, K., Stevens, R., Goble, C., Pocock, M., Li, P., & Oinn, T. (2006). Taverna: A tool for building and running workflows of services. Nucleic Acids Research, 34(Web Server issue), W729–W732.
Illari, P., & Floridi, L. (2012). IQ: Purpose and dimensions. In 17th International Conference on Information quality (ICIQ 2012), Paris.
Lee, Y., Strong, D., Kahn, B., & Wang, R. (2002). AIMQ: A methodology for information quality assessment. Information & Management, 40(2), 133–146.
Loshin, D. (2004). Enterprise knowledge management – The data quality approach (Series in Data management). Morgan Kaufmann.
McGilvray, D. (2008). Executing data quality projects: Ten steps to quality data and trusted information. Amsterdam/Boston: Morgan Kaufmann.
Medjahed, B., Bouguettaya, A., & Elmagarmid, A. (2003). Composing web services on the semantic web. VLDB Journal, 12(4), 333–351.
Missier, P., Embury, S., Greenwood, R., Preece, A., & Jin, B. (2006). Quality views: Capturing and exploiting the user perspective on data quality. In U. Dayal et al. (Eds.), Proceedings of the 32nd International Conference on Very large data bases (VLDB’06) (pp. 977–988). Seoul: ACM Press.
Missier, P., Embury, S., Hedeler, C., Greenwood, M., Pennock, J., & Brass, A. (2007). Accelerating disease gene identification through integrated SNP data analysis. In Proceedings of the 4th International Workshop on Data integration in the life sciences (pp. 215–230). Springer.
Motro, A., & Rakov, I. (1998). Estimating the quality of databases. In Proceedings of the Third International Conference on Flexible query answering systems (FQAS’98) (pp. 298–307). Springer-Verlag.
Pipino, L. L., Lee, Y. W., & Wang, R. Y. (2002). Data quality assessment. Communications of the ACM, 45(4), 211–218.
Preece, A., Jin, B., Missier, P., Embury, S., Stead, D., & Brown, A. (2006a). Towards the management of information quality in proteomics. In Proceedings of 19th IEEE International Symposium on Computer-based medical systems (CBMS’06) (pp. 936–940). Salt Lake City: IEEE Computer Society Press.
Preece, A., Jin, B., Pignotti, E., Missier, P., & Embury, S. (2006b). Managing information quality in e-science using semantic web technology. In Proceedings of 3rd European Semantic web conference (ESWC06) (LNCS 4011, pp. 472–486). Springer.
Stead, D., Preece, A., & Brown, A. (2006). Universal metrics for quality assessment of protein identifications by mass spectrometry. Molecular & Cellular Proteomics, 5(7), 1205–1211.
Wang, R., & Strong, D. (1996). Beyond accuracy: What data quality means to data consumers. Journal of Management Information Systems, 12(4), 5–34.
Acknowledgements
The work reported in this paper on Quality Views was supported by a grant from the EPSRC. The opinions of the authors have been greatly improved by discussions with their colleagues on the Qurator project team, in the Information Management Group at Manchester and the Informatics research group at the University of Cardiff.
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Embury, S.M., Missier, P. (2014). Forget Dimensions: Define Your Information Quality Using Quality View Patterns. In: Floridi, L., Illari, P. (eds) The Philosophy of Information Quality. Synthese Library, vol 358. Springer, Cham. https://doi.org/10.1007/978-3-319-07121-3_3
DOI: https://doi.org/10.1007/978-3-319-07121-3_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07120-6
Online ISBN: 978-3-319-07121-3
eBook Packages: Humanities, Social Sciences and Law; Philosophy and Religion (R0)