Noise Removal Framework for Market Basket Analysis

  • Conference paper
Smart Trends in Information Technology and Computer Communications (SmartCom 2017)

Abstract

Rapid development in the data analysis domain has created an ever-growing demand for Market Basket Analysis. However, existing methods in this domain emphasize techniques for selecting appropriate items. In this paper, we develop a framework for cleaning the dataset, based on the proposition that “Better noise removal brings out better data analysis”. Eliminating noisy objects is an essential goal of data preprocessing, because noise hampers data analysis. Recently developed data cleaning techniques concentrate on removing noise that results from low-level data errors caused by a defective data-gathering process. However, data objects that are connected or related only at some particular time, or that are unrelated or unimportant, can also significantly interfere with data analysis. Thus, to improve data analysis to a greater extent, data that is noisy with respect to the underlying analysis must be removed during data preprocessing, which is one of the steps of Knowledge Discovery in Databases (KDD). Removing all types of noise therefore requires dedicated data cleaning strategies. Because data sets can contain large amounts of noise, these methods also need to be able to remove an extensive portion of the data. To improve data analysis in the presence of high noise levels, this paper proposes a method for noise removal.
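The abstract does not describe the framework's algorithm, so the following is only an illustrative sketch of what removing noise "with respect to the underlying analysis" could look like before Market Basket Analysis. It assumes a simple stand-in technique (dropping items whose support falls below a threshold, then discarding baskets left empty); the function name `remove_noisy_items`, the `min_support` parameter, and the sample baskets are hypothetical and are not taken from the paper.

```python
from collections import Counter


def remove_noisy_items(transactions, min_support=0.3):
    """Drop rarely purchased items, then drop baskets that end up empty.

    Assumption: an item is treated as noise when the fraction of baskets
    containing it is below min_support. This is a stand-in heuristic,
    not the method proposed in the paper.
    """
    n = len(transactions)
    # Count each item at most once per basket.
    counts = Counter(item for basket in transactions for item in set(basket))
    keep = {item for item, c in counts.items() if c / n >= min_support}
    cleaned = [sorted(set(basket) & keep) for basket in transactions]
    return [basket for basket in cleaned if basket]


# Hypothetical sample data for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "milk", "batteries"],
    ["butter", "milk"],
    ["umbrella"],
]
cleaned = remove_noisy_items(baskets, min_support=0.4)
print(cleaned)
```

With these sample baskets, "batteries" and "umbrella" each appear in only one of five baskets and are filtered out, which also removes the basket that contained only "umbrella". A high threshold can discard a large share of the data, which matches the abstract's point that cleaning methods must tolerate removing an extensive portion of a noisy data set.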

Author information

Corresponding author

Correspondence to Roshan Gangurde .

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Gangurde, R., Kumar, B., Gore, S.D. (2018). Noise Removal Framework for Market Basket Analysis. In: Deshpande, A., et al. Smart Trends in Information Technology and Computer Communications. SmartCom 2017. Communications in Computer and Information Science, vol 876. Springer, Singapore. https://doi.org/10.1007/978-981-13-1423-0_24

  • DOI: https://doi.org/10.1007/978-981-13-1423-0_24

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-1422-3

  • Online ISBN: 978-981-13-1423-0

  • eBook Packages: Computer Science, Computer Science (R0)
