Ordinal Assessment of Data Consistency Based on Regular Expressions

  • Conference paper
Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2016)

Abstract

In this paper, a novel method is proposed for measuring the consistency of individual, text-valued attributes. The first novelty of this method is that it allows a broad range of well-known consistency measurements to be expressed in a simple, elegant and standardized way. This property is obtained by relying on the standardized framework of regular expressions to support measurement. The key advantage of using such a highly standardized expression syntax is that knowledge about consistency becomes portable, exchangeable and easy to access. The second novelty of the method is that it examines the advantages of using a finite and ordinal scale to express measurements. These advantages include a high degree of interpretability and efficient calculations in terms of both time and space complexity.
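As a rough illustration of the idea, and not the authors' actual formulation, the sketch below maps a text value to a level on a small ordinal scale by testing it against an ordered list of regular expressions. The scale labels, the date patterns and the names `Consistency`, `RULES` and `assess` are all assumptions made for this example.

```python
import re
from enum import IntEnum


class Consistency(IntEnum):
    """Hypothetical three-level ordinal scale (higher is better)."""
    INCONSISTENT = 0
    PLAUSIBLE = 1
    CONSISTENT = 2


# Hypothetical rules for a date attribute, listed from strictest to loosest.
RULES = [
    # ISO-style date such as 2016-06-20
    (Consistency.CONSISTENT,
     re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")),
    # Loosely date-like value such as 20/06/2016
    (Consistency.PLAUSIBLE,
     re.compile(r"\d{1,2}[/.-]\d{1,2}[/.-]\d{2,4}")),
]


def assess(value: str) -> Consistency:
    """Return the level of the first rule whose pattern matches the whole value."""
    for level, pattern in RULES:
        if pattern.fullmatch(value):
            return level
    return Consistency.INCONSISTENT


values = ["2016-06-20", "20/06/2016", "June 2016"]
levels = [assess(v) for v in values]
# On an ordinal scale, order-based aggregates such as the minimum are meaningful.
worst = min(levels)
```

Because the levels are only ordered, aggregation in this sketch is limited to order-based operations (minimum, maximum, median) rather than averages.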

Notes

  1. This example is fictional.

  2. Such a paradigm is also used to determine the normal form of a relational database [5, 6].

  3. The assignment of indices to groups is standardized and is based on the order in which opening brackets appear in the pattern.
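The group-numbering convention in note 3 can be checked with a small sketch using Python's `re` module, which follows the same convention; the pattern and the sample value are made up for illustration.

```python
import re

# Nested groups: indices follow the order of the opening brackets,
# so the outer group gets index 1 and the groups inside it get 2 and 3.
pattern = re.compile(r"((\d{4})-(\d{2}))-(\d{2})")
match = pattern.fullmatch("2016-06-20")

# Index 0 always refers to the whole match.
groups = {i: match.group(i) for i in range(pattern.groups + 1)}
```

Here `groups[1]` holds the year-month part captured by the outer bracket, while `groups[2]` and `groups[3]` hold the year and month captured by the brackets opened after it.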

References

  1. Ballou, D., Pazer, H.: Modeling completeness versus consistency tradeoffs in information decision systems. IEEE Trans. Knowl. Data Eng. 15(1), 240–243 (2003)

  2. Batini, C., Cappiello, C., Francalanci, C., Maurino, A.: Methodologies for data quality assessment and improvement. ACM Comput. Surv. 41(3), 16–52 (2009)

  3. Batini, C., Scannapieco, M.: Data Quality: Concepts, Methodologies and Techniques. Data-Centric Systems and Applications. Springer, Heidelberg (2006)

  4. Clark, P.G., Grzymala-Busse, J.W., Rzasa, W.: Consistency of incomplete data. Inf. Sci. 322, 197–222 (2015)

  5. Codd, E.F.: A relational model of data for large shared data banks. Commun. ACM 13(6), 377–387 (1970)

  6. Codd, E.F.: Recent investigations in relational data base systems. In: IFIP Congress, pp. 1017–1021 (1974)

  7. Cong, G., Fan, W., Geerts, F., Jia, X., Ma, S.: Improving data quality: consistency and accuracy. In: Proceedings of the VLDB Conference, pp. 315–326 (2007)

  8. Damm, M.: Total anti-symmetrische Quasigruppen. Ph.D. thesis, Philipps-Universität Marburg (2004)

  9. Dubois, D., Prade, H.: Practical methods for constructing possibility distributions. Int. J. Intell. Syst. 31, 215–239 (2015)

  10. Even, A., Shankaranarayanan, G.: Value-driven data quality assessment. In: Proceedings of the International Conference on Information Quality, pp. 265–279 (2005)

  11. Even, A., Shankaranarayanan, G.: Understanding impartial versus utility-driven quality assessment in large data-sets. In: Proceedings of the International Conference on Information Quality, pp. 265–279 (2007)

  12. Even, A., Shankaranarayanan, G.: Utility-driven assessment of data quality. DATA BASE Adv. Inf. Syst. 38(2), 75–93 (2007)

  13. Fürber, C., Hepp, M.: Towards a vocabulary for data quality management in semantic web architectures. In: Proceedings of the 1st International Workshop on Linked Web Data Management (LWDM2011), pp. 265–279 (2011)

  14. Heinrich, B., Kaiser, M., Klier, M.: How to measure data quality? A metric based approach. In: Proceedings of the International Conference on Information Systems, pp. 1–15 (2007)

  15. Heinrich, B., Kaiser, M., Klier, M.: Does the EU insurance mediation directive help to improve data quality? A metric-based analysis. In: European Conference on Information Systems, pp. 1871–1882 (2008)

  16. Heinrich, B., Klier, M.: Metric-based data quality assessment - developing and evaluating a probability-based currency metric. Decis. Supp. Syst. 72, 82–96 (2015)

  17. Heinrich, B., Klier, M., Kaiser, M.: A procedure to develop metrics for currency and its application in CRM. ACM J. Data Inf. Qual. 1(1), 5:1–5:28 (2009)

  18. IEEE: ISO/IEC/IEEE 9945:2009 Information technology: Portable Operating System Interface (POSIX) base specifications, issue 7 (2009)

  19. Krantz, D., Luce, D., Suppes, P., Tversky, A.: Foundations of Measurement: Additive and Polynomial Representations, vol. I. Academic Press, New York (1971)

  20. Luhn, H.P.: Computer for verifying numbers, US Patent 2,950,048 (1960)

  21. Pipino, L., Lee, Y., Wang, R.: Data quality assessment. Commun. ACM 45(4), 211–218 (2002)

  22. Redman, T.: Data Quality for the Information Age. Artech-House, Boston (1996)

  23. Wang, R., Storey, V., Firth, C.: A framework for analysis of data quality research. IEEE Trans. Knowl. Data Eng. 7(4), 623–640 (1995)

  24. Wang, R., Strong, D.: Beyond accuracy: what data quality means to data consumers. J. Manage. Inf. Syst. 12(4), 5–34 (1996)

Author information

Corresponding author

Correspondence to Antoon Bronselaer.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Bronselaer, A., Nielandt, J., De Mol, R., De Tré, G. (2016). Ordinal Assessment of Data Consistency Based on Regular Expressions. In: Carvalho, J., Lesot, MJ., Kaymak, U., Vieira, S., Bouchon-Meunier, B., Yager, R. (eds) Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2016. Communications in Computer and Information Science, vol 611. Springer, Cham. https://doi.org/10.1007/978-3-319-40581-0_26

  • DOI: https://doi.org/10.1007/978-3-319-40581-0_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-40580-3

  • Online ISBN: 978-3-319-40581-0

  • eBook Packages: Computer Science (R0)
