A Data Quality Framework for Customer Relationship Analytics

  • Conference paper
Web Information Systems Engineering – WISE 2015 (WISE 2015)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 9419))

Abstract

Poor data quality is an increasingly pervasive problem for organizations, leading to operational inefficiency, increased costs, and missed opportunities. Because high-quality data is a prerequisite for trusted data analysis, we propose a framework that improves data quality by improving the data model. In particular, we show how changes to the underlying data design can achieve key data quality properties. We conduct a case study that applies the framework to a customer relationship management (CRM) problem. Our evaluation shows that a set of CRM queries can be run efficiently over data sizes of up to 10 million records, and that organizations can glean new insights about customer preferences and activity.

Notes

  1. We refer to data quality rules and (integrity) constraints interchangeably.

  2. The name AirWave is used to protect the organization’s identity.

  3. The CRM queries can be found at: www.cas.mcmaster.ca/~sitaras/casestudy/.

Author information

Corresponding author

Correspondence to Fei Chiang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Chiang, F., Sitaramachandran, S. (2015). A Data Quality Framework for Customer Relationship Analytics. In: Wang, J., et al. Web Information Systems Engineering – WISE 2015. WISE 2015. Lecture Notes in Computer Science, vol 9419. Springer, Cham. https://doi.org/10.1007/978-3-319-26187-4_35

  • DOI: https://doi.org/10.1007/978-3-319-26187-4_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26186-7

  • Online ISBN: 978-3-319-26187-4

  • eBook Packages: Computer Science, Computer Science (R0)
