Abstract
Rapid development in the data analysis domain has created an ever-growing demand for Market Basket Analysis. However, existing methods in this domain emphasize techniques that concentrate on selecting appropriate items. In this paper, we develop a framework for cleaning the dataset, based on the proposition that “Better noise removal brings out better data analysis”. Eliminating noisy objects is an essential goal of data preprocessing, as noise hampers data analysis. Recently developed data cleaning techniques concentrate on removing noise that results from low-level data errors caused by a defective data gathering process. However, data objects that are connected or related only at some particular time, or that are unrelated or unimportant, can also significantly interfere with data analysis. Thus, to improve data analysis to a greater extent, data that is noisy with respect to the underlying analysis must be removed during data preprocessing, which is one of the steps of Knowledge Discovery in Databases (KDD). Hence, data cleaning strategies are needed that remove all types of noise. Because data sets can contain large amounts of noise, these methods also need to be able to remove an extensive portion of the data. To improve data analysis in the presence of high noise intensity, this paper presents a method for noise removal.
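To make the idea of removing analysis-level noise before Market Basket Analysis concrete, the following is a minimal sketch, not the framework proposed in the paper. It assumes that an item is treated as noise when it appears in too few baskets; the function name `remove_low_support_items`, the `min_support` threshold, and the sample baskets are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def remove_low_support_items(transactions, min_support=0.2):
    """Drop items that occur in fewer than `min_support` of the baskets.

    Assumption for illustration only: low support is used as a stand-in
    for "noisy with respect to the analysis". The paper may define noise
    differently.
    """
    n = len(transactions)
    if n == 0:
        return []
    # Count each item at most once per basket.
    counts = Counter(item for basket in transactions for item in set(basket))
    keep = {item for item, c in counts.items() if c / n >= min_support}
    cleaned = []
    for basket in transactions:
        filtered = [item for item in basket if item in keep]
        # Baskets left empty after filtering carry no information; drop them.
        if filtered:
            cleaned.append(filtered)
    return cleaned


# Hypothetical sample data: "zzz_typo" and "gift_card" each occur once.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "milk", "zzz_typo"],
    ["butter", "milk"],
    ["gift_card"],
]
cleaned = remove_low_support_items(baskets, min_support=0.3)
print(cleaned)
```

With a high threshold, a filter of this kind can discard a large share of the data, which is the situation the abstract describes for data sets with high noise intensity.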
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Gangurde, R., Kumar, B., Gore, S.D. (2018). Noise Removal Framework for Market Basket Analysis. In: Deshpande, A., et al. Smart Trends in Information Technology and Computer Communications. SmartCom 2017. Communications in Computer and Information Science, vol 876. Springer, Singapore. https://doi.org/10.1007/978-981-13-1423-0_24
DOI: https://doi.org/10.1007/978-981-13-1423-0_24
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1422-3
Online ISBN: 978-981-13-1423-0
eBook Packages: Computer Science, Computer Science (R0)