Conceptual Machine Learning Framework for Initial Data Analysis

  • Conference paper
Computing and Network Sustainability

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 75)

Abstract

This century has witnessed the emergence of a new branch of science, data science, which facilitates the analysis of large amounts of data and in turn supports model-based, data-driven decisions. The prelude to any successful analytical model building and implementation phase is a properly conducted initial data analysis (IDA) stage. IDA encompasses the laborious tasks of data cleansing: missing value treatment, outlier detection, checking the veracity of data, and data transformation, thus preparing the data for model building. A systematic, disciplined, and non-personalized approach to IDA reduces the probability of incorrect and inaccurate results from the model. The volume of data presented for model building today makes the IDA stage a crucial task that cannot be conducted manually. Machine learning can be applied to analyze large and complex data and to find patterns accurately; hence, it could also be used for data preparation prior to model building. This paper tries to reduce the ad hoc nature of IDA by providing a conceptual framework using machine learning.
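The IDA tasks named in the abstract (missing value treatment, outlier detection, data transformation) can be illustrated with a minimal sketch. This is not the framework proposed in the paper, which is not reproduced on this page. It assumes a single numeric column with missing entries encoded as None, and it uses simple statistical stand-ins (median imputation, an interquartile-range rule, min-max scaling) rather than machine learning models. All function names are hypothetical.

```python
from statistics import median, quantiles


def impute_median(values):
    """Missing value treatment: replace None with the median of observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def flag_outliers_iqr(values, k=1.5):
    """Outlier detection: mark values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def min_max_scale(values):
    """Data transformation: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def initial_data_analysis(values):
    """Run the three steps in order; return the scaled inliers and the outlier mask."""
    filled = impute_median(values)
    mask = flag_outliers_iqr(filled)
    kept = [v for v, is_outlier in zip(filled, mask) if not is_outlier]
    return min_max_scale(kept), mask


if __name__ == "__main__":
    column = [10, 12, None, 11, 13, 300, 12]  # illustrative data, not from the paper
    scaled, mask = initial_data_analysis(column)
    print(scaled)
    print(mask)
```

Fixing the order of the steps in one function is one way to make the procedure repeatable rather than ad hoc. In a learning-based variant, each step could be swapped for a trained model, for example a nearest-neighbour imputer in place of the median fill.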

Author information

Corresponding author

Correspondence to M. S. Smitha Rao.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Smitha Rao, M.S., Pallavi, M., Geetha, N. (2019). Conceptual Machine Learning Framework for Initial Data Analysis. In: Peng, SL., Dey, N., Bundele, M. (eds) Computing and Network Sustainability. Lecture Notes in Networks and Systems, vol 75. Springer, Singapore. https://doi.org/10.1007/978-981-13-7150-9_6

  • DOI: https://doi.org/10.1007/978-981-13-7150-9_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-7149-3

  • Online ISBN: 978-981-13-7150-9

  • eBook Packages: Engineering, Engineering (R0)
