Abstract
High-quality data is essential for data analysis and mining. Data quality can be measured by many indicators, and several methods have been proposed that improve data quality by improving one or more of these indicators. However, little work has examined how the order in which data quality indicators are processed affects the overall data quality. In this paper, first, some data quality indicators and their improvement methods are presented; second, the impact of the processing order of data quality indicators on the overall data quality is discussed, and a novel data quality improvement method based on the greedy algorithm is proposed. Experiments show that the proposed method improves data quality while reducing time and computational costs.
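The idea of choosing a processing order greedily can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details are not given in the abstract. It assumes a toy setting with two hypothetical indicators (completeness and uniqueness), two hypothetical improvement steps (`fill_missing`, `drop_duplicates`), and an overall score defined here as the plain average of the two indicators. At each round, the step that yields the highest overall score is applied next.

```python
from typing import Callable, Dict, List, Tuple

Data = List[dict]


def completeness(data: Data) -> float:
    # Fraction of records whose "value" field is present (hypothetical indicator).
    return sum(r["value"] is not None for r in data) / len(data)


def uniqueness(data: Data) -> float:
    # Fraction of records that are distinct (hypothetical indicator).
    return len({tuple(sorted(r.items())) for r in data}) / len(data)


def overall_score(data: Data) -> float:
    # Assumption: overall quality is the unweighted mean of the indicators.
    return (completeness(data) + uniqueness(data)) / 2


def fill_missing(data: Data) -> Data:
    # Hypothetical completeness improvement: replace missing values with 0.
    return [{**r, "value": 0 if r["value"] is None else r["value"]} for r in data]


def drop_duplicates(data: Data) -> Data:
    # Hypothetical uniqueness improvement: keep the first copy of each record.
    seen, out = set(), []
    for r in data:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out


def greedy_order(
    data: Data,
    improvers: Dict[str, Callable[[Data], Data]],
    score: Callable[[Data], float],
) -> Tuple[List[str], Data]:
    """At each round, apply the remaining step that gives the best score."""
    remaining = dict(improvers)
    order: List[str] = []
    current = data
    while remaining:
        # Try every remaining improvement on the current data.
        candidates = {name: fn(current) for name, fn in remaining.items()}
        best = max(candidates, key=lambda name: score(candidates[name]))
        if score(candidates[best]) <= score(current):
            break  # no remaining step raises the overall score
        current = candidates[best]
        order.append(best)
        del remaining[best]
    return order, current


records = [
    {"id": 1, "value": 10},
    {"id": 1, "value": 10},
    {"id": 2, "value": None},
    {"id": 3, "value": None},
]
steps = {"drop_duplicates": drop_duplicates, "fill_missing": fill_missing}
order, cleaned = greedy_order(records, steps, overall_score)
```

On this toy data the sketch selects `fill_missing` before `drop_duplicates`, because filling values first raises the averaged score more than deduplicating first. Note that this naive version re-evaluates every remaining step in each round; any cost savings claimed in the abstract would depend on details of the actual method that are not shown here.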
Acknowledgments
This work was supported by the State Grid Corporation Science and Technology Project (Contract No.: SGLNXT00YJJS1800110).
Copyright information
© 2019 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Wang, Z., Fu, Y., Song, C., Ge, W., Qiao, L., Zhang, H. (2019). A Data Quality Improvement Method Based on the Greedy Algorithm. In: Zhai, X., Chen, B., Zhu, K. (eds) Machine Learning and Intelligent Communications. MLICOM 2019. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 294. Springer, Cham. https://doi.org/10.1007/978-3-030-32388-2_22
DOI: https://doi.org/10.1007/978-3-030-32388-2_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32387-5
Online ISBN: 978-3-030-32388-2
eBook Packages: Computer Science, Computer Science (R0)