Analyzing Data Quality Trade-Offs in Data-Redundant Systems

  • C. Cappiello
  • M. Helfert


For technical and architectural reasons, data in information systems are often stored redundantly across several databases. Changes are propagated between these databases by a synchronization mechanism, which ensures a certain degree of consistency. Depending on the delay with which changes are propagated, synchronization is classified as real-time synchronization (high synchronization frequency) or lazy synchronization (low synchronization frequency). In practice, lazy synchronization is very common, but the delay it introduces causes misalignments among data values and therefore degrades data quality: the longer the interval between two realignments, the higher the probability that data are incorrect or out of date. This paper analyses the correlation between data quality criteria and synchronization frequency and reveals trade-offs between criteria such as availability and timeliness. The results illustrate the problem of balancing competing data quality requirements in the design of information systems. The problem is examined in selected types of information systems that are generally characterized by a high degree of data redundancy.
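
The trade-off between timeliness and availability can be made concrete with a small numerical sketch. It is not the model used in the paper; it rests on two illustrative assumptions: updates to the master copy arrive as a Poisson process with rate `update_rate`, and every synchronization run makes the replica unavailable for a fixed duration `sync_cost`. The function and parameter names are hypothetical.

```python
import math


def expected_staleness(update_rate: float, sync_interval: float) -> float:
    """Average probability that a replica is out of date.

    Assumption (not from the paper): updates follow a Poisson process with
    rate `update_rate`, and the replica is inspected at a uniformly random
    moment within a synchronization period of length `sync_interval`.
    """
    if update_rate < 0 or sync_interval < 0:
        raise ValueError("update_rate and sync_interval must be non-negative")
    x = update_rate * sync_interval
    if x == 0:
        return 0.0
    # 1 - (1 - exp(-x)) / x, written with expm1 for numerical accuracy.
    return 1.0 + math.expm1(-x) / x


def availability(sync_cost: float, sync_interval: float) -> float:
    """Fraction of time the replica can answer queries.

    Assumption (not from the paper): each synchronization run blocks the
    replica for `sync_cost` time units once per `sync_interval`.
    """
    if sync_interval <= 0:
        raise ValueError("sync_interval must be positive")
    return max(0.0, 1.0 - sync_cost / sync_interval)


if __name__ == "__main__":
    # Hypothetical numbers: 0.5 updates per minute, 1 minute per sync run.
    for interval in (2, 5, 10, 30, 60):
        stale = expected_staleness(0.5, interval)
        avail = availability(1.0, interval)
        print(f"interval={interval:>3} min  staleness={stale:.3f}  availability={avail:.3f}")
```

Under these assumptions, lengthening the synchronization interval raises availability but also raises the chance of reading out-of-date values, which is the kind of conflict between criteria that the abstract describes.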


Data Quality, Data Warehouse, Synchronization Mechanism, Synchronization Frequency, Query Response Time
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Physica-Verlag Heidelberg 2008

Authors and Affiliations

  • C. Cappiello, Politecnico di Milano, Milano, Italy
  • M. Helfert, Dublin City University, Dublin, Ireland
