Abstract
Data analytics is a growing field that derives insights, predictions and patterns from raw data. The outcome of any analysis depends heavily on the quality of the input data. This paper explores the design of a quality-aware data capture system that uses data mining techniques, specifically a decision-tree-based approach to data validation and verification. The objective is to identify data quality issues at the point where data enters the system and to give the user appropriate feedback through a carefully designed user interface.
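As a rough illustration of the idea, validating a record at entry time by walking a decision tree and returning feedback for the user interface might look like the sketch below. The abstract does not give the paper's actual tree, attributes, or messages, so everything here is an assumption: the names `Node`, `Leaf`, `validate` and `TREE`, the `age` field, and its 0 to 120 range are hypothetical, and the tree is written by hand where a real system would presumably learn it from historical data.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    """Terminal verdict: whether the record passes, plus a message for the UI."""
    ok: bool
    feedback: str


@dataclass
class Node:
    """Internal node: a yes/no test on the record, with one branch per answer."""
    test: Callable[[dict], bool]
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]


def validate(record: dict, tree: Union[Node, Leaf]) -> Leaf:
    """Walk the tree from the root until a leaf is reached and return it."""
    node = tree
    while isinstance(node, Node):
        node = node.if_true if node.test(record) else node.if_false
    return node


# Hypothetical hand-written tree for a single 'age' field (an assumption,
# not taken from the paper).
TREE = Node(
    test=lambda r: isinstance(r.get("age"), int),
    if_true=Node(
        test=lambda r: 0 <= r["age"] <= 120,
        if_true=Leaf(True, "Record accepted."),
        if_false=Leaf(False, "Age must be between 0 and 120."),
    ),
    if_false=Leaf(False, "Age is missing or not a whole number."),
)

if __name__ == "__main__":
    for rec in ({"age": 34}, {"age": 340}, {}):
        result = validate(rec, TREE)
        print(rec, result.ok, result.feedback)
```

Because each leaf carries its own message, the capture form can show the feedback for the first failed check as soon as the record is submitted, before the data reaches any downstream analysis.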
Acknowledgements
This work was supported by a grant from Evive Software Analytics Pvt. Ltd.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Mehta, R.V.K., Verma, S. (2017). Design of a Quality-Aware Data Capture System. In: Tan, Y., Takagi, H., Shi, Y. (eds) Data Mining and Big Data. DMBD 2017. Lecture Notes in Computer Science, vol 10387. Springer, Cham. https://doi.org/10.1007/978-3-319-61845-6_28
DOI: https://doi.org/10.1007/978-3-319-61845-6_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61844-9
Online ISBN: 978-3-319-61845-6
eBook Packages: Computer Science, Computer Science (R0)